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Abstract 


This thesis is concerned with system identification and control of nonlinear systems using 
neural networks. The work has been carried out keeping two objectives in mind. First, to 
design training algorithms for neural networks which are simple, efficient and capable of 
being implemented in the real time. Second, to design viable neural network controllers 
for nonlinear and underactuated systems. 

Recurrent neural networks are capable of learning dynamic nonlinear systems where 
complete information about the states are not available. Memory Neuron Networks, a 
special class of RNN, has been used for identifying SISO as well as MIMO systems. 
The weights are adjusted using Back Propagation Through Time (BPTT). To increase the 
modeling accuracy, two other algorithms namely. Real Time Recurrent Learning (RTRL) 
and Extended Kalman Filtering (EKF) have been proposed for MNN. Simulation experi- 
ments show that RTRL provides best approximation accuracy at the cost of large training 
time and large training set. EKE gives comparable approximation accuracy with signifi- 
cant reduction in the number of presentations required as compared to RTRL. 

A novel algorithm based on Lyapunov stability theory has been proposed for weight 
update in feedforward networks. Interestingly, the proposed algorithm has a parallel with 
the popular back propagation (BP) algorithm. It is shown that fixed learning rate in BP 
could be replaced by an adaptive learning rate which is computed using Lyapunov func- 
tion approach. It is shown that a modification in the Lyapunov function can lead to smooth 
search in the weight space thereby speeding up the convergence. Through simulation re- 
sults on various benchmark problems, it is established that the proposed algorithm out- 
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perform both BP and EKF algorithms in terms of convergence speed. Certain system 
identification issues are also analyzed for this algorithm 

Some of the recent and widely known neural network based controllers have been 
analyzed in detail. Two existing algorithms namely NN based robust backstepping con- 
trol and singular perturbation technique have been used to control various kinds of robot 
manipulators including flexible link and flexible joint manipulators. A neural controller 
based on partial feedback linearization has been proposed for pendubot. The simulation 
results show a promise that neural networks can be used for this class of underactuated 
mechanical systems which is yet to be tested through hardware implementations. 
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Chapter 1 


Introduction 

1.1 Artificial Neural Network 

Artificial neural networks (ANNs) are computational paradigms which implement sim- 
plified models of their biological counterparts, biological neural networks. Biological 
neural networks are the local assemblages of neurons and their dendritic connections that 
form the (human) brain. Accordingly, ANNs are characterized by 

• Local processing in artificial neurons (or processing elements, PEs), 

• Massively parallel processing, implemented by rich connection pattern between 
PEs, 

• The ability to acquire knowledge via learning from experience, 

• Knowledge storage in distributed memory, the synaptic PE connections. 
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The attempt of implementing neural networks for brain-like computations like patterns 
recognition, decisions making, motor control and many others is made possible by the 
advent of large scale computers in the late 1950’s. Indeed, ANNs can be viewed as 
a major new approach to computational methodology since the introduction of digital 
computers. 

Although the initial intent of ANNs was to explore and reproduce human information 
processing tasks such as speech, vision, and knowledge processing, ANNs also demon- 
strated their superior capability for classification [1], pattern recognition and function 
approximation problems. Neural networks have recently emerged as a successful tool 
for identification and control of dynamical systems [2, 3]. This is due to Jhe computa- 
tional efficiency of the back propagation algorithm [4, 5] and the versatility of three layer 
feedforward network in approximating static nonlinear functions. 


1.2 System Identification 

System identification can be described as building good models of unknown systems 
from measured data. Identification of a system has two distinct steps: (i) choosing a 
proper model and (ii) adjusting the parameters of a model so as to minimize a certain fit 
criterion. Since dynamical systems are described by differential or difference equations 
in contrast to static systems that are described by algebraic equations, the choice of proper 
neural network is cmcial. If all the states of the systems are available for measurement, 
then a multilayer perceptron (MLP) is sufficient to model the system. But in many cases, 
where all the states are not available for measurement or complete knowledge of dynamics 
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are not known, feed forward networks can not be used. Recurrent neural networks (RNN) 
with internal memories are capable of identifying such systems. Thus the choice of proper 
network model is very cracial in sytem identification problems. As for the second step of 
identifcation, error back propagation algorithm based on gradient descent is very popular 
for both feed forward as well as recurrent networks [6, 5]. Slower convergence of BP 
has encouraged researchers to come up with various faster convergence algorithms [7, 8, 
9]. Some of the popular algorithms in this category are Lavenberg-Marquardt [10, 11], 
Conjugate gradient [12], Extended Kalman Filtering [13, 14]. These algorithms, although 
fast, are computationally intensive and require huge amount of memory. Finding weight 
update algorithms, which are simple and efficient in terms of computation and memory, 
is still a promising research area. 

1.3 Neural Networks in Nonlinear control 

The class of nonlinear systems that has been dealt with in this thesis include robot ma- 
nipulators [15]. Robot manipulators are characterized by complex nonlinear dynamical 
structures with inherent unmodelled dynamics and unstructured uncertainties. These fea- 
tures make the designing of controllers for manipulators a difficult task in the framework 
of classical adaptive or non-adaptive control. Simulations and experimental results of a 
number of researchers such as Narendra and Parthasarathy [2, 16], Chen and Kbalil [3] 
and many others in early 90’s confirmed the applicability of neural networks in the area 

I 

of dynamic modeling and control of nonlinear systems. Some other works in this field 
include the feedback linearization using NN by Narendra and Levin [17], dynarnic neural 
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network for input-output linearization by Delgado et al [18], network inversion control 
design by Behera et al. [19]. What these works signify is that there is a growing need of 
algorithms that can be implemented in the real time. 

Robust and adaptive controls have been used extensively to control robot manipula- 
tors [20, 21, 22]. These techniques make certain simplifying assumptions which reduce 
their applicability. Moreover computational requirement goes up even with moderately 
complex systems with many uncertain parameters. Recent literatures in robust and adap- 
tive control report a lot of applications of some new schemes like back-stepping [21, 23] 
and singular perturbation [24]. The application of neural networks has been able to re- 
move some of the limitations of classical adaptive techniques, thus broadening the class 
of systems that can be solved by these NN based controllers. Kwan et al. [25] have 
demonstrated neural network based robust backstepping control for various kinds of non- 
linear systems. Similarly, Lewis et al. [26] have demonstrated various Neural Network 
based controllers for robot manipulators. Their work have shown that simple 2-layer 
feedforward networks can be used for taking into account inherent nonlinearities and pa- 
rameter variations in the system thus avoiding cumbersome computations required for 
determining regression matrices. Moreover, these controllers can be used online owing 
to the simplicity of their architecture and learning methodologies. 

Underactuated mechanical systems (UAMS) are a different class of nonlinear systems 
where the number of actuators is always less than the number of degrees of freedom. The 
possibility of reducing the number of actuators to control a system has attracted a lot 
of attention towards UAMS. Some of the well known underactuated systems are inertia 
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wheel pendulum, pendubot, acrobot and furuta pendulum. Robot manipulators with flex- 
ible joint and flexible links are also regarded as underactuated because of the reduced 
control effectiveness. Fantoni et al. [27] have proposed various energy and passivity 
based controllers for UAMS. Spong [28] have demonstrated a method based on partial 
feedback linearization for acrobot while Block [29] has demonstrated a similar method 
for pendubot. Quite recently, Saber [30] has suggested various methods for underactuated 
mechanical systems. Infact, he, for the first time, provides a classification of the under- 
actuated systems and suggested different control schemes for different classes. Use of 
neural networks and other machine learning approaches for these kind of problems have 
not been reported so far in the literature. It seems to be an open field for research. 

1.4 Brief overview of the thesis 

This thesis consists of two parts. The first part deals with system identification with 
Neural Networks while the later part deals with design of neural network based controllers 
for nonlinear systems like robot manipulators. 

Three different learning algorithms namely BPTT, RTRL and EKF have been used 
for training a recurrent network model (MNN) to identify both SISO as well as MEMO 
systems and a performance comparison has been made. EKP has been used for training 
MNN for the first time. 

A new algorithm based on Lyapunov stability theory has been proposed for training 
feedforward networks and a performance comparison has been made with two existing 
popular algorithms. 
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A detailed analysis has been done on the recent neural network based adaptive and 
robust control techniques for various robot manipulators. This will give direction to my 
future research endeavours. Some existing algorithms like neural network based robust 
back stepping and singular perturbation techniques have been used to control various 
robot manipulators. 

A neural network based controller has been designed for pendubot which is yet to be 
tested on a hardware set up. 


1.5 Memory neuron networks 

As mentioned earlier, recurrent Networks can be used for identifying dynamic networks, 
but deriving algorithms for weight update is quite complex because of the presence of 
feedbacks. Sastry et al [31] showed that by adding a temporal element to each neuron, 
one can achieve the functional capability of a recurrent network while preserving the 
simplicity of feedforward networks. They called it Memory Neuron Network (MNN). 
They proposed an algorithm based on Back-Propagation for updating weights. Some 
extension to their work has been done. RTRL [6] and EKP [13] have been proposed for 
this architecture of RNN. The efficiencies and usefulness of these three algorithms have 
been compared. These algorithms have been tested on both single-output and double- 
output systems and their performance has been assessed through various simulations. 
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1.6 Lyapunov based learning algorithm for feedforward net- 
works 

The training time and number of training examples required by a particular algorithm has 
always been a matter of concern. A training algorithm that requires minimum training 
exemplars and minimum computational time is always desirable. An effort has been 
made to come up with an algorithm which outweighs some of the best known algorithms 
for feedforward networks. Use of lyapunov functions for finding out suitable control 
input for any system is quite popular and has been demonstrated by Behera et al [32, 19]. 
Extending the same concept to neural networks, an weight update algorithm (LF-I) based 
on Lyapunov stability theory [33] has been derived. The adaptive learning rates of LF 
algorithm have been analyzed and this gives us the clue to avoid local minima during 
training. A modification in LF-I has also been suggested which is found to speed up the 
error convergence. A performance comparison has been made with two existing popular 
algorithms namely BP and EKF on three benchmark problems. System identification 
issues have also been discussed for the proposed algorithm. 

1.7 Neural controllers for robot manipulators 

Some of the recent NN based control techniques for robot manipulators have been im- 
plemented. These include robust back stepping control [25] and singular perturbation 
technique [26] for both flexible link as well as flexible joint manipulators. A neural net- 
work controller based on partial feedback linearization has been designed for pendubot. 
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Through extensive simulations, the performance of these controllers have been analyzed. 


1.8 Organization of the thesis 

Chapter 2 is concerned with Memory Neuron Networks and its training. 

In Chapter 3, a new training algorithm based on Lyapunov function has been proposed 
and analyzed. 

Chapter 4 concerns with classical and neural network based adaptive controls. 

In chapter 5, two existing techniques namely back-stepping and singular perturbation 
have been used to control various robot manipulators. 

In chapter 6, a Neural Network based controller based on partial feedback lineariza- 
tion is suggested for pendubot. 

Chapter 7 gives concluding remarks and future direction of research. 



Part I 

System Identification with Neural 

Networks 
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Chapter 2 


Identification Of Nonlinear 
Dynamical Systems Using Recurrent 
Neural Networks 

2.1 Introduction 

A recurrent network model with internal memory is best suited for identification of sys- 
tems for which incomplete or no knowledge about its dynamics exists. In this sense, 
Memory Neuron Networks (MNN) [31] offer truly dynamic models for identification of 
nonlinear dynamic systems. The special feature of these networks is that they have inter- 
nal trainable memory and can hence, directly model dynamical systems without having 
to be explicitly fed with past inputs and outputs. Thus, they can identify systems whose 
order is unknown or systems with unknown delay. Here each unit of neuron has, associ- 
ated with it, a memory neuron whose single scalar output summarizes the history of past 
activations of that unit. The weights of connection into memory neuron involve feedback 
loops, the overall network is now a recurrent one. 

The primary aim of this chapter is to analyse the different learning algorithms on the 
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Figure 2.1; Structure of Memory Neuron Model 

basis of modeling accuracy and computational complexity. The weights of MNN [31] are 
adjusted using a BPTT update algorithm. To increase the modeling accuracy two other al- 
gorithms namely RTRL [6] and EKF [13] have been proposed. It is concluded that RTRL 
identifies the system more efficiently when modeling accuracy as well as computational 
complexity are taken into account. 

2.2 Memory Neural Network 

In this section, the structure of the network is described. The network used is similar to 
the one that is described in [31]. The architecture of a Memory Neuron Model is shown 
in figure 2.1. At each level of the network, except the output level, each of the network 
neurons has exactly one memory neuron connected to it. The memory neuron takes its 
input from the corresponding network neuron and it also has a self feedback. This leads 
to storage of past values of the network neuron in the memory neuron. All the network 
neurons and the memory neurons send their output to the network neurons of next level. 
In the output layer, each network neuron can have a cascade of memory neurons and each 
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of them send their output to that network neuron in the output layer. 

2.2.1 Dynamics of the network 

The following notations are used to describe the functioning of the network. 

L is the number of layers of the network with layer 1 as the input layer and layer L as 
the output layer. 

Ni is the number of network neurons in layer L 

x^j (k) is the net input to the jfth network neuron of layer I at time k. 

(fc) is the output of the jth network neuron of layer I at time k. 

Vj{k) is the output of the memory neuron of the jth network neuron of layer I at time 
k, KL. 

w\j{k) is the connecting weight from the ith network neuron of layer I to the jxh net- 
work neuron of layer Z + 1 at time k. 

flj (k) is the connecting weight from the memory neuron of the ith network neuron of 
layer I to the jth network neuron of layer Z + 1 at time k. 

a^j{k) is the connecting weight from the jth network neuron to its corresponding mem- 
ory neuron. 

06 fj{k) is the connecting weight from the {j — l)th memory neuron to the jth memory 
neuron of the ith network neuron in the output layer at time k. 
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{k) is the output of the jth memory neuron of the ith network neuron in the output 
layer at time fc. 

P(j{k) is the connecting weight from the jth memory neuron of the ith network neuron 
in the output layer at time k. 

Mj is the number of memory neurons asociated with the jth network neuron of the 
output layer. 

g{.) is the activation function of the network neurons. 

The net input to the jth network neurons of layer Z, 1 < Z < L, at time k is given by 

N1-.1 

x]{k + 1) = Y^ w[j\k) s’r\k) + X] fli\k)vt\k) (2.1) 

i=0 i=l 

In the above equation, we assume that 5 q(/c) = 1 for all I and Wqj is the bias for the j*^ 
network neuron in the layer I + 1. The output of the network neuron is given by 

s^j(k + 1) = g(x^j(k + 1)),1<1<L (2.2) 

The activation function used for hidden and output nodes are gi and g 2 respectively. They 
have following form 



(2.3) 


(2.4) 
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Here ci, C 2 ,ki and are the parameters of the activation function. The net input to the 
jth network neuron in the output layer is given by 

x^{k+ 1 ) 

The output of all the memory neurons except for those in the output layer, are derived as 
follows: 

v\{k + 1 ) = a]{k)s\{k) + (1 - a]{k))v]{k) ( 2 . 6 ) 

For the memory neurons in the output layer, 

v^^{k + 1 ) = + (1 - ( 2 . 7 ) 

where by notation, we have Vio = sf . To ensure stability of the network dynamics, we 
impose the condition that 0 < < 1. 

2.3 Learning Algorithms 

Different learning algorithms to be used for the MNN are described here. At each instant, 
an input is supplied to the system and the output of the network is calculated using the 
dynamics of MNN. A teaching signal is then obtained and is used to calculate the error 
at the output layer and update all the weights in the network. The usual squared error is 




Mj 

+ ^Pfi{k)vfi{k) 

i=l 
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used and is given by 


m=J^{sfik)-yjik)f (2.8) 

j=l 

where, yj{k) is the teaching signal for the output node at time fc. 

2.3.1 Back Propagation Through Time Algorithm 

The training algorithm using back propagation through time [5] for a recurrent net is 
based on the observation that the performance of such a network for a fixed number of 
the time steps N is identical to the results obtained from a feed forward net with 2N 
layers of adjustable weights. The final equation for updating the weights are given below; 

wlj{k + 1) = w\j{k) - r]^j^^ik)s\{k) 1 <l < L (2.9) 

where rj is the step size and local gradient term 5j is given by 

5f{k) = isfik) - y^ik))9\xf{k)) ' (2.10) 

Nl+1 

Sj(k) = 9'(x;-(k)) ^ Sj+^(k)wjp(k) 1<1<L (2.11) 

The above is the standard back propagation of error without considering the memory neu- 
rons. The updating of / is same as that of w except that the output of the corresponding 
memory neuron is used rather than the network neuron. 

fl-(k + l) = f^(k)-vS}-^\k)vj(k) l<r<L (2.12) 
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The various memory coefficients are updated as given below: 


a]ik + 1) - a]ik) ~ ,1<1<L (2.13) 

ri P dv — 

c§(k + 1 ) = alj(k) - r,'-^lk)-^{k) ( 2 . 14 ) 

l3tj(k+l] = l3f-(k)-n'St(k)v^j(k} . ( 2 . 15 ) 


where 


dE 


ATi+i 


3 5=1 


dv''k 




— {k) =Sj(fc — 1) —Vj{k~l) 


riP 

^(k) = l3lj(k)iHk) 


3v^' 

^(/c) = u§_i(A: - 1) - vj:^{k - 1) 


( 2 . 16 ) 


( 2 . 17 ) 

( 2 . 18 ) 

( 2 . 19 ) 


Two step size parameters are used in the above equations namely, r]' for memory coeffi- 
cients and 77 for remaining weights. To ensure the stability of the network, we project the 
memory coefficient back to the interval ( 0 , 1 ), if after above updating they are outside the 
interval. 


2.3.2 Real Time Recurrent Learning Algorithm 

This algorithm [ 6 ] can be mn on line, learning while sequences are being presented rather 
than after they are complete. It can thus deal with sequences of arbitrary length, there are 
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no requirements to allocate memory proportional to the maximum sequence length. The 
notations used in this algorithm are as follows: 


^’S#+i) = 


d$\,{k + 1 ) 
dwfJk) 


— + 1 )) 


!^l-i 


dKjik) dw^^ik) 


r=0 


( 2 . 20 ) 


+ 1) = + 

« ( 2 . 21 ) 

(1 - 4(fc))(5i'’/):) 

ICfi + 1 ) = 

' (2.22) 

where 1 < n < L and 1 < Z < L. It may be noted that these values at time fc = 0 are 

initialised to zero. Also depending on the plant equation, these values can be re-initiahsed 
to zero after a particular number of time steps. The learning rule for this algorithm is 
derived as follows: 

BE 

wfjik + l)=wfjik)-rj-^^^j^ (2.23) 

(* + 1) - «P('‘ + + 1) 
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where 


+ 1) = /(ajp (fc + 1)) 

M, ( 2 . 25 ) 

s=l 

The update of / is same as that of w except that the output of the corresponding memory 
neuron is used rather than the network neuron. The various memory coefficients are 
updated as in previous algorithm with the difference that the learning is real time. 

2.3.3 Extended Kalman Filter Algorithm 

The Extended Kalman Filter uses second order training that processes and uses informa- 
tion about the shape of the training problem’s underlying error surface. Williams et al. 
[34, 35] have provided a detailed analytical treatment of EKF training of the recurrent 
networks, and suggested a four to six fold decrease relative to RTRL in the number of 
presentations of the training data for some simple finite state machine problems. EKF 
is a method of estimating state vector. Here the weight vector a{t) is considered as the 
state vector to be estimated. The MNN can be expressed by following nonlinear system 
equations for input. 

ai{t) — ai{t - 1) (2.26) 

y<^ = h[ai{t)]+e{t) (2.27) 

Here is the desired output, y{t) is the estimated output vector at time (t-1) and the 
approximation error e(t) is assumed as the white noise vector with covariance matrix R(t). 
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The covariance matrix is unknown a priori and has to be estimated. For this purpose, R{t) 
is assumed to be a diagonal matrix AJ. The initial state a(0) is assumed to be a random 
vector. The following real time learning algorithm [13] is used to update the weights. 

&i it) =ai{t-l)+ Ki it) [y<^ - yit)\ (2.28) 


where K, the Kalman filter gain is given by 


K.{t) - -n(t - m (01/ - ^ (2.29) 


A(t) = A{t - 1) + ~ (t) »(<)) _ A(t - 1)1 (2.30) 

■^max 


Piit) = Pi(f - 1) - Piit - l)Kiit)Hiit) 


(2.31) 


where 




(2.32) 


Note that all the Pi coefficients for the corresponding weights are initialised to unity. 


2.4 MNN For Modelling Of Dynamical Systems 

A series-parallel model is obtained (for a SISO plant) by having a network with two input 
nodes to which we feed u(fc) and ypik — 1). The single output of the network will be 
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Figure 2.2: System identification Model 
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yp{k). This identification system is shown in Figure 2.2. 

Vpik) = F{u{k),u{k-l),yp{k-l),.,.,yp{k),...) (2.33) 

To model an m-input and p-output plant, a network with m + p inputs and p outputs will 
be used. This will be the case irrespective of the order of the system. The actual outputs 
of the plant at each instant are used as teaching signals. 

2.4.1 Simulation 

Two examples of nonlinear plant are identified by MNN. Series-parallel model (Fig- 
ure 2.2) is used for identification. The networks with only one hidden layer are used. 
So, the notation m : n is used to denote a network that has m hidden network neurons 
and n memory neurons per node in the output layer. SISO plant has two inputs u{k) and 
yp{k — 1) and output yp{k). The number of inputs to the identification model does not 
depend on the order of the plant. 

Network parameters: The network size used for all examples and algorithms is 6:1. 
The same learning rate is used for all problems with 77 = 0.2 and 77/ = 0.1. Also the 
same activation functions gi for hidden nodes and p2 for output nodes are used with 
Cl = C2 = A;i = ^2 = 1. Attenuation constant is used in the plant output so that the 
teaching signal for the network is always in [-1,1]. 

Training the network: 77000 time steps are used for training the network. The net- 
work is trained for 2000 iterations on zero input; then for two-third of remaining training 
time, the input is independent and identically distributed (iid) sequence uniform over [- 
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2,2] and for rest of the training time, the input is a single sinusoid given by 5m(7r^). 
After the training, the output of the network is compared with that of the plant on a test 
signal for 1000 timesteps. For the test phase, the following input is used. 

k 

u{k) = 5m(7r— ), k < 250 
25 

= 1.0, 250 < A: < 500 

= -1.0, 500 </c<750 

k k 

= 0.3sm(7r— ) + 0.l5m(7r— ) + 

25 32 

k 

0.6sm(7r— ), 750 <k< 1000 (2.34) 


Example 1: This indicates the ability of the MNN to learn a plant of unknown order. 
Here the output of the plant is given as below: 


yp{k + 1) = fiyp{k),yp{k - l),yp{k - 2),u{k),uik - 1)) (2.35) 


where 


fixuX2,X3,X4,X5) 


XIX2XZX5{X3 - 1 ) + 3:4 
1 + + Xg 


(2.36) 


Example 2: This is a MIMO plant with two inputs and two outputs. The plant is 
specified by: 


ypi(A: + l)-0.5[^^^2^^^^+ni(A:)] 

(2.37) 


(2.38) 
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Figure 2.3: Plant and network output with BPTT algorithm 



Figure 2.4: Plant and network output with RTRL algorithm 


The identification is found to be good for the MIMO plant also. 


Example 

BPTT 

RTRL 

EKF 

Ex.No.l 

0.013293 

0.006088 

0.006642 

Ex.No.2 o/pl 

0.005753 

0.001414 

0.002595 

Ex.No.2 o/p2 

0.008460 

0.001258 

0.001563 


Table 2.1: Mean Square Error while identifying with MNN 


The examples described have been simulated using all the algorithms discussed. For 
example 1, figures 2.3, 2.4, 2.5 give the outputs of the plant and the model network. 
The plant and network outputs for example 2 is shown in the figure 2.6. The mean square 
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Figure 2.5: Plant and network output with EKF algorithm 

errors for all the algorithms for both examples are shown in the Table 2.1. 

2.5 Summary 

Memory Neuron Networks offer truly dynamical models. The memory coefficients are 
modified online during the learning process. Here the network has a near to feed-forward 
structure which is useful for having an incremental learning algorithm that is fairly ro- 
bust. We can consider MNN to be a locally recurrent and globally feed-forward architec- 
ture that can be considered as intermediate between feed-forward and general recurrent 
networks. 

The Back Propagation through time (BPTT) algorithm is not an online training pro- 
cess but the real time recurrent learning algorithm is an online training algorithm with 
very good identification properties. The Extended Kalman Filter is a fast algorithm and 
shows comparable identification capabilities. It can be concluded from the graphs and 
error obtained in the previous section that EKF algorithm is the best suitable for mod- 
eling while the approximate gradient descent is the least favourable. The complexity of 
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Output - 1 Output - 1 Output - 



(a) BpTT: first output 



(c) RTRL: first output 


0 200 400 600 800 

Data Points 

(e) EKF: first output 




(d) RTRL: second output 



(f) EKF: second output 




Figure 2.6: Simulation results for example 2 
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computation increases from BPTT to RTRL to EKF. f 

By introducing dynamics directly into the feed forward network stmcture, MNN rep- 
resents an unique class of dynamic model for identifying any generalized plant equation. 
From the extensive simulations of different algorithms carried out and observing the re- 
sults obtained, we conclude that EKF is one of the best learning algorithms for this model. 
But, from the above discussion we find that the complexity of calculations involved in- 
creases with the decrease in error. So, future research in this field will hopefully lead to 
improvement in that direction. 





Chapter 3 

An Adaptive Learning Algorithm 
for Feedforward Networks using 
Lyapunov Function Approach 


3.1 Introduction 

This chapter is concerned with the problem of training a multilayered feedforward neural 
network. Faster convergence and function approximation accuarcy are two key issues in 
choosing a training algorithm. The popular method for training a feedforward network 
has been the back propagation algorithm (BP) [1, 16]. One of the shortcomings of this 
algorithm is the slow rate of convergence. A lot of research has been done to accelerate 
the convergence of the algorithm. Some of the approaches use ad-hoc methods like using 
momentum while others use standard numerical optimization techniques. The popular 
optimization techniques use quasi-Newton methods. The problem with these methods 
are that, their storage and memory requirements go up as the square of the size of the 
network. The nonlinear optimization technique such as the Newton method [12, 7] - con- 
jugate gradient descent - have been used for training. Though the algorithm converges 
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in fewer iterations than the BP algorithm, it requires too much computation per pattern. 
Extended Kalman filtering (EKF) [13] and recursive least square (RLS) [8] based ap- 
proaches which are also popular in nonlinear system identification have been proposed 
for training feedforward networks. Among other algorithms, Lavenberg-Marquardt (LM) 
algorithm is quite well-known [10, 9, 11] for training feedforward networks; although 
such algorithms are also computationally expensive. 

In this chapter, we have proposed a very simple and easy to implement algorithm 
based on Lyapunov stability criterion to train a feedforward neural network. Interest- 
ingly, the proposed algorithm has exact parallel with the popular BP algorithm except 
that the fixed learning rate in BP algorithm is replaced by an adaptive learning rate. This 
algorithm was earlier used by Behera et al. [32, 19] for network inversion and controller 
weight adaptation as well. Yu et al. [36] have proposed a backpropagation learning 
framework for feedforward neural network and showed that all other algorithms like LM, 
GD, Gauss-Newton etc. are special cases of this generalized framework. They have also 
made use of lyapunov stability theory to derive the update law. They have choosen a dif- 
ferent lyapunov function and they have not explored the nature of algorithms on different 
problems. Our work provides a better insight into the working of the algorithm. Three 
bench-mark functions, XOR, 3-bit parity and 4-2 Encoder are taken to study compara- 
tive performance of the proposed algorithm with BP and EKF. A system identification 
problem is also considered for testing function approximation accuracy. Lastly, we com- 
pared our algorithm with BP on a 2-D Gabor function [37] approximation problem which 
provides more insight into the working of this algorithm. 
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Figure 3.1: A Feed-forward Neural Network 


3.2 Lyapunov Function (LF I) Based Learning Algorithm 


A simple feedforward neural network with single output is shown in figure 3.1. The net- 
work is parametrized in terms of its weights which can be represented as a weight vector 
W G • For a specific function approximation problem, the training data consists of, 
say, N patterns, {a:^, = 1, 2, N. For a specific pattern p, if the input vector is 

then the network output is given by 


f = f{W,x^) p=l,2,...Ar (3.1) 

The usual quadratic cost function which is minimized to train the weight vector W is: 

E = (3.2) 
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In order to minimize the above cost function, we consider a Lyapunov function for the 
system as below: 

V = \{y^y) ( 3 . 3 ) 

where y — [y^ — y^ — y^, , — y'^]^- As can be seen, in this case the 

Lyapunov function is same as the usual quadratic cost function minimized during batch 
update using back-propagation learning algorithm. The time derivative of the Lyapunov 
furiction V is given by 

V = -y^W = -y^JW (3.4) 

where 

T — T c dNxM 

■’-gw 

Theorem 3.1 If an arbitrary initial weight W (0) is updated by 


Wit’) = W(0) H- 



Wdt 


(3.5) 


where 


W = 



(3.6) 


then y converges to zero under . the condition that W exists along the convergence trajec- 
tory. 


proof: Substitution of equation (3.6) into equation (3.4), we have 


V = -\\y f < 0 ' (3.7) 
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where V < 0 for all y 0. If F is uniformly continuous and bounded, then according to 
Barbalat’s lemma [23] as f — > 00 , 1/ — > 0 and y — > 0. The weight update law given in 
equation (3.5) is a batch update law. Analogous to instantaneous gradient descent or BP 
algorithm, the instantaneous LF I learning algorithm can be derived as: 


W = 



(3.8) 


where y = y^~y^ & R and Ji = G . The difference equation representation 

of the weight update equation based on equation (3.8) is given by 


Wit + 1 ) 


Wit) + yWit) 


(3.9) 

(3.10) 


Here yU is a constant which is selected heuiistically. We can add a very small constant e 
to the denominator of equation (3.6) to avoid numerical instability when error y goes to 
zero. Now, we compare the Lyapunov function based algorithm (LF I) with the popular 
BP algorithm based on gradient descent principle. In gradient descent method we have. 


AW 


dE 

^dW 


= yJi^y 


(3.11) 

(3.12) 


Wit + l) = Wit) + r]Ji^y 


(3.13) 
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Figure 3.2: Adaptive Learning rates for LF-I: The four curves correspond to four patterns 
of the XOR problem. The number of epochs of training data required can be obtained by 
dividing the number of iterations by 4. 

where rj is the learning rate. Comparing equation (3.13) with equation (3.9), we see a 
very interesting similarity where the fixed learning rate rj in BP algorithm is replaced by 
its adaptive version rja- 

Earlier, there have been many research papers concerning the adaptive learning rate [38]. 
However, in this chapter, we formally derive this adaptive learning rate using Lyapunov 
function approach, and is a natural key contribution in this field. For training XOR func- 
tion, we plot adaptive learning rate in figure 3.2. It should be noted that as training con- 
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verges, this adaptive learning rate rja goes to zero as expected. In simulation section, we 
will show that this adaptive learning rate makes the algorithm faster than the conventional 
BP. 


3.3 Modified Lyapunov function (LF II) Based Learning Algo- 
rithm 


In this section, we modify the Lyapunov function considered in LF I algorithm so that 
weight update rule can account for smooth search in the weight space. A possible Lya- 
punov function candidate for smooth search can be as follows: 


V = ^{y^y + XW^W) 


( 3 . 15 ) 


where W = W — W. W is the ideal weight vector and W is the actual weight vector. 
The variable y is as defined in LF 1. The parameter A is a constant. The purpose of adding 
a second term will be revealed later in the section. The time derivative of Lyapunov 
function is given by 


= -y^{J+D)W 


( 3 . 16 ) 

( 3 . 17 ) 
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where J = : N x M Jacobian matrix, W = - W and 




(3.18) 


Theorem 3.2 If an arbitrary initial weight is updated by 


W(t') = W(0) + 



(3.19) 


where W is given by: 


W = 


II y f 

II {J + D^y 




(3.20) 


then y converges to zero under the condition that W exists along the convergence trajec- 
tory. 


proof: Substituting for W from (3.20) into (3.16), we have 


i^ = -||y f<o 


(3.21) 


where V < 0 for all y ^ 0 and F = 0 iff y = 0. As derived in LF I, the instantaneous 
weight update equation using modified Lyapunov function can be finally expressed in 
difference equation model as follows: 


Wit + l) = W{t)+{p- 


y f 


{Ji + Df-y 


r, 112 


)iJi + Dfy 


(3.22) 
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To draw out a similar comparison between LF-II and back-propagation, we consider the 
following cost function, 

E=^{fy + XW'^W) (3.23) 

Using gradient-descent. We have, 

dE 

= n{Ji + Dfy (3.25) 

W{t + l) = W{t) + rjiJi + Dfy ' (3.26) 


Comparing equations (3.22) and (3.26), the adaptive learning rate in this case is given by 


iJi + D)^ 


1I2 


(3.27) 


The adaptive learning rate for LF-II in case of XOR problem is shown in figure 3.3. In 
equation (3.15), we define W to be the difference between last two consecutive weight 
values. By doing so, we are putting a constraint on the variation of weights. We will show 
in the simulation that this term has the effect of smoothing the search in the weight space 
thereby speeding up the convergence. This also helps in achieving uniform performance 
for different kind of initial conditions which is unseen in case of BP. 
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Figure 3.3: Adaptive Learning rates for LF-II. The four curves cdrrespond to four patterns 
of the XOR problem. The number of epochs of training data required can be obtained by 
dividing the number of iterations by 4. 

3.4 Simulation Results 

A two-layered feedforward network is selected for each problem. Unity bias is applied 
to all the neurons. We test proposed algorithms LF I and LF 11 on three bench-mark 
problems, XOR, 3-bit Parity and 4-2 Encoder and a system-identification problem. The 
proposed algorithms are compared with popular BP and EKF algorithm. All simulations 
were carried out on an AMD Athlon (2000 XP) machine (1.6GHz) running on Linux 
(RedHat 9.0). For XOR, 3-bit Parity and 4-2 Encoder problems, we have taken unipo- 
lar sigmoid as our activation function while for system-identification problem we have 
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Algorithm 

epochs 

time(sec) 

paranieters 

BP 

mmm 


77 = 0.5 

BP 

HIb^I 

■lliltW 

rj = 0.95 

EKF 



A = 0.9 

LF-I 


103391 


LF-n 


10009 

H = 0.55, A = 0.01 


Table 3.1: comparison among three algorithms for XOR problem 


choosen bipolar sigmoid activation function for neurons. The patterns are presented se- 
quentially during training. For benchmark problems, the training is terminated when the 
mean square error per epoch reaches 10"'^. Since the weight search starts from initial 
small random values, and each initial weight vector selection can lead to different con- 
vergence time, average convergence time is calculated for fifty different mns. Each run 
implies that the network is trained from any arbitrary random weight initialization. In 
Back Propagation algorithm, the value of learning rate r) is taken to be 0.95. It is to be 
noted that in usual cases, learning rate for BP is taken to be much smaller than this value. 
But in our case, the problems being simpler, we are able to increase the speed of conver- 
gence by increasing this learning rate. We have deliberately done this to show that the 
proposed algorithms are still faster than this. The initial value of A in EKF is 0.9. The 
value of constant p in both LF I and n is selected heuristically for best performance. Its 
value lies between 0.2-0.6 and A in LF-II (3.15) is kept at a low value (=0.01-0.1). It 
simply suggests that we are giving less weightage to the second term in that equation. 


3.4.1 XOR 


For XOR, we have taken 4 neurons in the hidden layer and the network has two inputs. 
The adaptive learning rates of LF-I and LF-II are shown in figure 3.2 and 3.3 respectively. 
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Algorithm 

epochs 

time(sec) 

parameters 

BP 


0.483 

ry =5 0.5 

BP 

5941 

0.2408 

77 = 0.95 

EKF 

2186 


A = 0.9 

LF-I 

796 



LF-II 

403 

0.0986 

H = 0.45, A = 0.01 


Table 3.2: comparison among three algorithms for 3-bit Parity problem 


It can be seen that the adaptive learning rate becomes zero as the network gets trained. 
The simulation results for XOR is given the table 3.1. It can be seen that LF-I and LF- 
n takes minimum number of epochs for convergence as compared to BP and EKF. We 
find that LF-II is nearly 10 times faster than the BP algorithm. Also, we see that LF-II 
performs better as compare to LF-I as far as training time is concerned. The figure 3.4 
gives a better insight into the performance of various networks. 


3.4.2 3-bit Parity 

For Parity problem, we have choosen a network with 3 inputs and 7 hidden neurons. Table 

3.2 shows the simulation results for this problem. Here also we find that LF-I and LF-II 
outperform both BP and EKF in terms of convergence time. In this case, we find that 
LF-n is nearly five times faster as compared to BP. EKF might be faster for a particular 
choice of initial condition but on an average it is slower as compared to BP in terms of 
computational time. The improved performance of LF-II over LF-I can be seen clearly 
from the figure 3.5. 
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50 


(a) XOR 


(b) 3“bit Parity 



(c) 4-2 Encoader 


Figure 3.5: Comparison of convergence time in terms of iterations between LF I and LF 





Algorithm 

epochs 

time(sec) 

parameters 

BP 

2104 

0.3388 

77 = 6.5 

BP 

1141 

0.1848 

77 = 0.95 

EKF 

1945 

2.4352 

0 

11 

-< 

LF-I 

81 

0.1692 

11 = 0.29 

LF-n 

70 

0.1466 

fj, = 0.3, A = 0.62 


Table 3.3: comparison among three algorithms for 4-2 Encoder problem 


3.4.3 4-2 Encoder 


T'or encoder, we take a network with four inputs, two outputs and 7 hidden neurons. The 
simulation results are shown in the table 3.3. Here also, we find that LF-I and LF-II 
perform better than BP and EKF. 


3.4.4 System Identification problem 

We consider the following system identification problem [19, 32] 

The same feedforward 2-layer network given in figure 3.1 with 7 hidden neurons is used. 
It has two inputs x{k) and u{k) respectively. BP, LF-II and EKF methods are used to train 
the network. The input u{k) is randomly varied between 0 and 1 and 40000 training data 
patterns are generated. The training data are normalized between -1 and 1. After training, 
the network is presented with a sinusoidal input u{k) = sin{t) for three periodic cycles 
as test data. The neural network response and the actual model response is compared for 
the test data in figure 3.6. The rms error for BP, LF-II and EKF are of order 0.0539219, 
0.052037, 0.0807811 respectively. The capability of EKF for system identification is 
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Rms em)P=0.0539219, Tmax=4CI000 


Rms eiTonsO.080781 1, Tmax=40000 




(a) BP; rms eiTor=0.0539219 (b) EKF: rms error=0.080781 1 


Rms ciTor=0.052037, Tmux=40000 



(c) LF-II: rms eiTor=0.052037 

Figure 3.6: System identification - the trained network is tested on test-data 
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Algorithm 

Hidden neurons 

rms error/run 

parameters 

BP 

40 

0.0939937 

1 — I 

o 

II 

r-T 

i 

! 

BP 


0.0444109 

??1,2 = 0.1 

LF-I 

|||||[||||Qg||H 

0.0359601 

11 

o 


40 

0.0421814 

fj, = 0.4, A = 0.1 


Table 3.4: Performance results for Gabor function 


quite well known. LF also shows comparable and even better approximation capabilities. 
It should also be noted that LF-II requires very less computation time in comparison 
to EKF for approximation. It is also noted that if we increase the number of learning 
iterations, BP and EKF saturates while LF n improves its prediction. 


3.4.5 2-D Gabor Function 

The convolution version of complex 2-D Gabor function has the following form 


g{xi,X2) = ^ 


(3.29) 


where A is an aspect ratio, cr is a scale factor, and uq and vq are modulation parameters. 
In this simulation, the following Gabor function is used. 

+ X2)) (3.30) 

The above function is shown in figure 3.7. We tried to train a 3-layer feedforward network 
to approximate this function. We started with random initial values of weights and found 
that none of the above three algorithms is able to converge. However, for few initial 
conditions, LF algorithm converge. Now, we took a radial basis function network and 
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g(xi.x2) 


2'D Gabor Function 



Figure 3.7: A 2-D Gabor function 


applied both BP and LF algorithm for training. 5000 training data sets and 10000 test 
data sets were taken for this network. The rms errors were calculated for 20 different runs 
where each run corresponded to a different initial condition. The results are summarized 
in the table 3.4. This table shows that LF algorithms have better function approximation 
capabilities as compared to BP. 


3.5 Summary 

We have proposed a novel algorithm for weight update in feedforward networks using 
Lyapunov function approach. The key contribution of this work is to show a parallel be- 
tween proposed LF I algorithm and popular BP algorithm. We showed that the fixed learn- 
ing rate rj in popular BP could be replaced by an adaptive learning rate rja = f/^TrJ^iW) 
which can be comprehensively computed using Lyapunov function approach. Thus a 
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natural way to improve faster convergence of the popular BP algorithm has been demon- 
strated. It was also shown that a modification in Lyapunov function can lead to smooth 
search in the weight space. Through simulation results on three bench mark problems, we 
establish that proposed LF I and LF n outperform both popular BP and EKF algorithms 
in terms of convergence speed. We also demonstrated the dynamic system identification 
capabilities of the proposed algorithm using two function approximation problems and 
compared it with the conventional BP. 



Part II 

Neural Controllers 
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Chapter 4 


Adaptive and Neural Network 
controllers, An analysis 


4.1 Introduction 

Adaptive controllers are usually used for controlling plants with uncertain parameters. 
These parameters are estimated and then used to calculate control input for the plant. 
Model reference Control and Self tuning controllers are two categories of adaptive con- 
trollers. In MRAC, the controller is so designed that the plant with controller indtates 
the performance of a reference model. In STC, controller is designed so that the closed 
loop systems gives the desired performance. An estimator which estimates the unknown 
parameters, is a part of the controller. Neural Network based adaptive controllers have 
certain advantages over conventional controllers. It can take into account not only pa- 
rameter uncertainty but also unmodeled system dynamics. A comparison among these 
three approaches has been made for a single-link manipulator problem. More complex 
problems have been addressed in next chapters. 
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4.2 Model Reference Adaptive control of a single link manip- 
ulator 


The plant model is given by 


ml^q 4- mglsin{q) == u (4.1) 

I 

where m is mass and I is length of the manipulator arm and g is acceleration due to 
gravity. The equation (4.1) may be rewritten as 

aq + bsin{q) = u (4.2) 

where a = ml^ and b = mgl are assumed to be unknown quantities. In case, these 
parameters are known, it is easy to compute a control input u such that tracking objectives 
are achieved. The purpose of an adaptive control is to estimate the values of parameters 
a and 6 in a recursive fashion, simultaneously achieving tracking convergence. 

Let’s consider a sliding surface, 

s = e + Aoe = q — qr (4.3) 

where qr — Qd^ 

Choose a control input as follows: 

u == dqr — K$ + bsin{q) ' (4.4) 
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where a and b are estimates of a and b respectively. The closed loop system is given by 

aq 4- bsin{q) = d^V — Ks + bsin{q) (4.5) 

Subtracting aqr from both sides of equation (4.5), we get 

a{q — 4V) = (d — a)qj^ — Ks + (6 — b)sin{q) 

as -h Ks = aqr + dsin{q) (4.6) 


where a = a — a and b = b — b. 


The response of the closed loop system can be written as 


s = 


1/a 


p + k/a 


{aqr + bsin{q)) = H{p){dqr + bsin{q)) 


(4.7) 


where p is the laplace variable and H{p) = There is a basic lemma stated below 

which can be used to compute adaptation laws for MRAC systems [24]. 

Lemma 4.1 Consider two signals e and (p related by the following dynamic equation 


e{t) = H{jp)[k4F{t)v{t)] (4.8) 


where e{t) is a scalar output signal, H{p) is a strictly positive real (SPR) transfer func- 
tion, k is an unknown constant with known sign, (p{t) is am x 1 vector of function of 
time, and v{t) is a measurable m x 1 vector. If the vector (j> varies according to 


P{t) = —sgn{k)jev{t) 


(4.9) 



with 7 being a positive constant, then e{t) and (j>{t) are globally bounded. Furthermore, 
if V is bounded, then 

e{t) -^0 as t oo 

In equation (4.7), H{p) is SPR. Hence, by lemma (4.1), following update law 


-JSQr 

(4.10) 

—jssin{q) 

(4.11) 


will ensure that a 0 as t -H' oo. The convergence of this update law can be proved 
by taking a suitable lyapunov function and finding its time derivative. For simulation, we 
have take m = 1, Z = 1 and g = 9.8. The simulation results are shown in Fig. (4.1). 


4.3 Self Tuning Control Based on Least Square Method 

Usually, the parameters of a controller are computed from the plant parameters. In case, 
plant parameters are not known, it is reasonable to replace them by their estimated values 
as provided by a parameter estimator. A controller obtained by coupling a controller 
with an online (recursive) parameter estimator is called a self-tuning controller. Thus, 
a self-tuning controller (STC) is a controller which performs simultaneous identification 
of the unknown plant. Figure 4.2 illustrates the schematic structure of such an adaptive 
controller. 
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Control Torque (Nm) Link Position (rad) 




(a) Link Position (b) Link Velocity 




(c) Control input (d) Parameter estimation 


Figure 4.1: Adaptive control of Single Link Manipulator 








Figure 4.2: A self tuning controller 


4.3.1 Parameter Estimation 

When there is parameter uncertainty in a dynamic system (linear or nonlinear), one way 
to reduce it is to use parameter estimation, i.e, inferring the values of parameters from the 
measurements of input and output signals of the system [24]. Parameter estimation can 
be offline or online. Off-line estimation is preferable if the parameters are constant and 
there is sufficient time for estimation before control. In case of slowly time varying pa- 
rameters, online parameter estimation is necessary to keep track of the parameter values. 
The essence of parameter estimation is to extract parameter information from available 
data concerning the system. A quite general model for parameter estimation application 
is in the linear parametrization form 


y{t) = W{t)a (4.12) 

where the n-dimensional vector y contains the outputs of the system, the m-dimensional 
vector a contains unknown parameters to be estimated, and the n x m matrix W{t) is a 
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signal matrix. It is to be noted that both y and W (t) are required to be known from the 
measurements of the system signals. 


4.3.2 Standard Least Square Estimator 

In the standard least-square method[24], the estimate of the parameters is generated by 
mininndzing the total prediction error 

J = [ \\y{r)-W(r)a{t)\\'^dr (4.13) 

Jq 

with respect to d(t). Since this implies the fitting of all past data, this estimate potentially 
has the advantage of averaging out the effects of measurement noise. The estimated 
parameter d satisfies 

[^W^Wdr]a= W^ydr (4.14) 

Jo Jo 

which is obtained from = 0. Define 

P(t) = [ r W^{r)W{r)dr]^^ (4.15) 

Jo 

To achieve computational efficiency, it is desirable to compute P recursively. So the 
above equation is replaced by a differential equation 

^[p-\t)] = W^it)W{t) (4.16) 

at 
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Differentiating equation (4.14) and using equations (4.15) and (4.16), we find that the 
parameter update satisfies 

a = —P{t)W"^ei (4.17) 

where ei — Wa — Wa = y — y and P{t) is the estimator gain matrix. By using the 
identity, 

we obtain following weight update equation 

P = -PW'^WP (4.18) 


In using (4.17) and (4.18) for online estimation, one has to provide an initial parameter 
value and an initial gain value. P(0) should be chosen as high as allowed by the noise 
sensitivity, d should be initialized with some finite value. 

Parameter Convergence 

From (4.16), (4.17) and (4.18), one can easily show that 

p-\t) = p-\0)+ rw'^{r)W{r)dr (4.19) 

Jo 

l[p-\t)a{t)] = 0 


Thus, 


a{t) = F(t)F-^(0)a(0) 


(4.20) 



if W is such that 


^min [ r W'^Wdr] —^oo as t —> oo (4.21) 

Jo 

where Aynini-] denotes the smallest eigenvalue of its argument, then the gain matrix con- 
verges to zero, and the estimated parameters asymptotically converge to the true param- 
eters. The condition (4.21) is satisfied if W is persistently exciting and P 0 and 
d 0. 


4.3.3 Simulation Example 

For simulation, we consider the plant (4.2) with a = 1 and b = 9.8. The model for 
parameter estimation is expressed as 

u{t) = W'^{t)z{t) (4.22) 

where W{t) = [q sin{q)Y' and z — [a 6]^ and u{t) is the control input to be esti- 
mated. The desired control input is similar to (4.4) and is rewritten as 

u(t) = aqd + bsin{q) -f K^e -1- Kde (4.23) 

The estimation error is ei = u — u and the tracking error is e = qd — q. The parameter 
vector a is updated using (4.17) while the estimation gain matrix P{t) is updated using 
(4.18). The simulation results are shown in figure (4.3). 
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4.4 Neural Network based Adaptive Controller 


As the number of unknown parameters increase, conventional adaptive control laws be- 
come more and more computationally intensive. By using neural networks one can get 
away with explicit estimation of parameters in order to compute control input for the 
plant. The uncertainty of parameters may be considered as unknown dynamics and a NN 
can be used to identify this uncertainty. Kwan et al [39, 25] have developed a weight 
update algorithm for Neural network that ensures the stability of closed loop system dy- 
namics. 


4.4.1 Problem definition and stability analysis 

Consider the plant model (4.2) which is reproduced below for convenience 

aq + bsin{q) = u 

Let’s define e = qj, — q, e = qd — q and r = e + Ae, then we have 

ar = a[qd — q + Ae] 

= a{qd 4- Ae) - aq 
= aiQd + Ae) + bsin{q) - u 

ar = F{qd, q,e)-u (4.24) 
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The equation (4.28) becomes 


V = -r'^Kr + m\\r\\tr{w'^W) 

= -r^Kr + m\\r\\tr{w'^ {W - W)) 


By using Cauchy-Schwarz Inequality 


tr{W^W - W^W) < IIWIIfII WIIf - liwill- 


and assuming bounded weights || W||i? < Wm, we get 


^ < -Aminllrf + m||r|l||W||i.(WM-||W||F) 
< -||r-ll[Ammllr|| +m(l|Wl|i. - 


where A^m is the minimum eigenvalue of K, V is negative if the quantity inside the 
square bracket in the above equation is positive. This gives the condition 


r > m 


wh 


4A„ 


(4.30) 


or 


\\W\\f > Wm 


(4.31) 
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Thus, V is negative outside a compact set. The control gain K can be selected large 
enough so that 


m 


Wlf 

4 Atm in 


< Br 


Hence, this demonstrates that both ||r|| and li WHi? are UUB. 


4.4.2 Simulation Results 

The simulation results are shown in figure (4.4). Following observations can be made 
from the simulation 

• As can be seen in figure (4.4d), neural network is able to approximate the nonlinear 
function with uncertain parameters quite closely. This relieves us from deriving 
explicit update algorithm for each parameter which may become more and more 
tedious as the complexity of the problem increases. 

• No restrictive assumptions like LIP (Linear in Parameters) are required for deriving 
the control law. So it is a more general method which can be used to control a 
broader class of nonlinear systems. 

• Control inputs are bounded. 

• For this problem, NN gives performance comparable to a high gain PD controller. 

But as we will see in next chapter, NN based controller give better performance for 
more complex problems. ^ ^ ^ 

'.‘.ifes.A lisiii:. 
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(a) Link Position 



(c) Control input 



(e) Position tracking error 
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(b) Link Velocity 



(d) Function approximation 



(f) PD control 


Figure 4.4: Neural Network based Adaptive Controller for single link Manipulator 
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4.5 Summary 


Two adaptive control techniques and one NN based control technique are used to solve 
a single-link manipulator problem and a comparative study has been made among them. 
We found that NN removes some of the restrictive features of adaptive control and thus 
helps in broadening the class of problems that can be solved by adaptive controllers.. 
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Chapter 5 

Neural Network controllers for 
Robot Manipulators 

5.1 Introduction 

Most commercially available robot controllers implement some variety of PID control 
algorithm. As performance requirements on speed and accuracy of motion increase, PBD 
controllers lag further behind in providing adequate robot manipulator performance. In 
the absence of any adaptive or learning capability, controllers might loose accuracy when 
uncertain parameter changes. 

Robust and Adaptive controllers [15, 24] have been applied successfully to robot 
manipulators. A serious problem in using adaptive control in robotics is the requirement 
for the assumption of linearity in unknown system parameters: 

fix) = Rix)^ (5.1) 

where fix) is a nonlinear robot function, R(x) is a regression matrix of known robot func- 
tions and ^ is a vector of unknown parameters (e.g. masses and friction coefficients). This 
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is an assumption that restricts the classes of systems amenable to control. Some forms of 
friction, for instance, are not linear in parameters (LIP). Moreover, this LIP assumption 
requires one to determine the regression matrix for the system; this can involve tedious 
computations, and a new regression matrix must be computed for each different robot 
manipulator. Some of these problems can be remedied by using Neural Networks. 

Neural Networks possess some very important properties, including a universal ap- 
proximation property [40] where, for every smooth function f{x), there exists a neural 
network such that 

fix) = W^aiV^x) + e (5.2) 

for some weights W, V. This approximation holds for all a; in a compact set S, and the 
functional estimation error e is bounded so that 

II e ||< CiV (5.3) 

with a known bound dependent on S. The approximating weights may be unknown, 
but the NN approximation property guarantees that they exist. Unlike adaptive control, 
no LIP assumption is required and the property (5.2) holds for all smooth functions /(.). 


5.2 Robot Arm Dynamics and Tracking Error Dynamics 

The dynamics of rigid-link robot arms have the form : 

M iq)q + Vmiq, q)q + Fiq) + G{q) + = r . (5.4) 
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where M{q) is the inertia matrix, Vm{q,q) is the Coriolis/centripetal matrix, F[q) are 
the friction terms, G(q) is the gravity vector, and represents disturbances. These 
dynamics may also include actuators. Thus, the control input T{t) can represent torques 
or motor currents, etc. The rigid dynamics have following properties: 

1. Property 5.1 (Boundedness of Inertia Matrix) ; The inertia matrix M (g) is sym- 
metric, positive definite, and bounded so that pil < M (q) < 112 I 

2. The Coriolis/centripetal vector Vm{q, q)q is quadratic in g. V^n is bounded so that 
II Vm ||< VB II q II, or equivalently || Vmq ||< vb || q ||^. 

3. Property 5.2 (Skew Symmetry) .• The Coriolis/centripetal matrix can always be 
selected so that the matrix S (g, q) = M (g) — 2Vm (g, g) is skew symmetric. There- 
fore, x^Sx = Ofor all vectors x, 

4. The gravity vector is bounded so that || G(g) ||< 

5. The disturbances are bounded so that || ||< 

The objective in this chapter is to make the robot manipulator follow a prescribed trajec- 
tory ga(t). Define the tracking error e(t) and filtered tracking error r(f) by 

e = (Id-^ (5-5) 

r = e + Ae (5.6) 

with A > 0, a positive definite design parameter matrix. Since (5.6) is a stable system, 
it follows that e(t) is bounded as long as the controller guarantees that the filtered error 
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r(i) is bounded. 


Assumption 5.1 (Bounded Reference Trajectory) The desired trajectory is bounded so 
that 


Qdit) 


Qd{t) 


< QB 


Qd{t) 


(5.7) 


with QB a known scalar bound. 


The robot dynamics can be expressed in terms of the filtered error as 

Mr = -VraT + /(x) + Td - r (5.8) 

where the unknown nonlinear robot function is defined as 

/(x) = M{q)iqd + Ae) + Vm{q,q)iqd + Ae) + F{q) + G{q) (5.9) 

One may define, for instance, 


X = [e^ qj qj qjf 

A general sort of approximation-based controller is derived by setting 

r = f + K„r-v{t) - (5.11) 
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Figure 5.1: Filtered error approximation-based controller 


with / an estimate of f{x), K^r = K^e + if^Ae an outer PD tracking loop, and v{t) 
an auxiliary signal to provide robustness in the face of disturbances and modelling errors. 
The multiloop control stmcture implied by this scheme is shown in Figure 5.1. The 
estimate / is obtained using NN as shown in the figure. Using this controller, the closed- 
loop error dynamics are 

Mr = -Vmr-Kyr + f + Td + v{t) (5.12) 

where the function approximation error is given by 

/ = /-/ ( 5 - 13 ) 


The following lemma is required in subsequent work. 
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Lemma 5.1 (Bound on NN Input x) For each time t, x{t) is bounded by 


ll^il < Cl +C 2 llr|| < te + col|r(0)l| +C 2 ||r|i (5.14) 

for computable positive constants cq, ci, C 2 . 

The proof is available in [26]. Another assumption which is required for these kinds of 
NN controllers is given below. 

Assumption 5.2 (Initial condition requirement) Let the NN approximation property ( 5.2 ) 
hold for the function f{x) given in (5.9) with a given accuracy e n for all x inside the ball 
of radius bx > qs- Let the initial tracking error satisfy l|r(0)|| < {bx — qB)/ioo + 02 ). 

This set specifies the set of allowed initial tracking errors r (0). The constants cq, C 2 , 6^ 
need not be explicitly determined. In practical situations, the initial condition require- 
ment merely indicates that the NN should be ’large enough’ in terms of the number L of 
hidden-layer units. 

5.3 Robust Backstopping Control using Neural Networks 

5.3.1 System Description 

Robust control of nonlinear systems with uncertainties is of prime importance in many 
industrial applications. The model of many practical nonlinear systems can be expressed 
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in a special state-space fonn 


= -FiCxi) -I- Gi(xi)a;2 
±2 = F2{xi,X2) + G2{xi,X2)X3 
^3 = , Fz{xi,X2,X3) + Gzixi,X2,X2)x4, 


— Fm(xi,X2, . . . ,X.m) + Gm(xi,X2, ■ ■ ■ ,Xfri)u (5-15) 

where Xi G J?”, i = 1, 2, . . . , m denote the states of the system, u E ii” is the vector 
of control inputs. Fi, Gi E i = 1, 2, . . . , m are nonlinear functions that contain 
both parametric and nonparametric uncertainties, and Gi'z are known and invertible. This 
requirement that Gj’s be invertible and known may be stringent. 

The equation (5.15) is known as strict-feedback form [23]. The reason for this name 
is that the nonlinearities Fj, Gi depend only on xi, X 2 , . - . , that is, on state variables 
that are "fed back". 

Stability of Systems [26] 

Consider the following nonlinear system 

X = f{x, t), y = h{x, t) (5.16) 

with state x(t) E BF. We say the solution is uniformly ultimately bounded (UUB) if there 
exists a compact set U C RF such that for all x(to) — xo E U, there exists an e > 0 and 
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a number T{e, xq) such that |lx(t)l| < € for all t>to + T. 


5.3.2 Traditional Backstepping Design 

The backstepping design [23, 25] can be applied to the class of nonlinear systems (5.15) 
as long as the internal dynamics are stabilizable. In this method, first select a desirable 
value of X 2 , possibly a function of xi, denoted X2d, such that in the ideal system xi = 
F 1 (xi , X2d), one has stable tracking by xi (t) of xi^. Then in second step, select X3 to be 
x^d so that X2 tracks X 2 d and this process is repeated. Finally, select u{t) such that Xm 
tracks Xmd- 

A number of robust and adaptive procedures exist which implement the above back- 
stepping method. The above backstepping procedures becomes more complicated when 
there exist parametric uncertainties in the systems. The complications are due to the 
following problems with the existing robust and adaptive procedures: 

1. ''regression matrices'' in each step of the backstepping design must be determined. 
The computation of regression matrices for robot manipulators is very tedious and 
time consuming. 

2. one basic assumption, that the unknown system parameters must satisfy the so- 
called " linearity 4n~parameter" , is quite restrictive and may not be true in many 
practical situations. 

By using Neural Network, one can alleviate the disadvantages of the tedious and lengthy 
process of determining and computing regression matrices while retaining the merit of 


61 



systematic design in the backstepping control. As no LIP assumption is made, this design 
can be applied to a broader class of nonlinear systems. 

5.3.3 Robust Backstepping controller design using NN 

NN Basics 

Let R denote the real numbers, J?” the real n-vectors, the real mxn matrices. Let 
5 be a compact simply connected set of R”. With map / : 5 R^, define C'’"(5') the 

functional space such that / is continuous. Define as the collection of NN weights. 
Then the net output is 

y = W^(j){x) (5.17) 

A general nonlinear function f{x) € C'^{S), x{t) G S can be approximated by an NN 
as 

f{x) = W^(j){x) + £(a:) (5.18) 

with e{x) a NN functional reconstruction error vector. The stmcture of a 2-layer NN is 
shown in Figure 5.2. For suitable NN approximation properties, (j){x) must satisfy some 
conditions. 

Definition 5.1 Let S be a compact simply connected set of if*, and (j>{x) : S —* R^^ be 
intergrable and bounded. Then <j>{x) is said to provide a basis for C^{S) if 

1. A constant function on S can be expressed as (5.18) for finite N 2 . 

2. The functional range ofNN (5.18) is dense in C’”(5) for countable N^. 
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Figure 5.2: Two layer Neural Network 


Here, <p{x) is chosen to be radial basis functions, as RBFs provide a universal basis for 
all smooth nonlinear functions. 


Controller structure 

Step 1 - Design fictitious controllers for X 2 ,X 3 , . . ., and Xm- First of all, we design the 
fictitious controller for X2d- Recalling that 

xi = Fi{xi) + Gi{xi)x2 (5.19) 

Choosing the following fictitous controller 


X2d = + iu - 


(5.20) 
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where Ki > 0 is a design parameter, Fi is the estimate of Fi. Substituting ( 5 . 20 ) into 
subsystem ( 5 . 19 ) yields the error dynamics 

h = Fi-Fi-Kiei + Gie2 ( 5 . 21 ) 

with 62 — ^2 ~ ^2d- The usual adaptive backstepping approach is to assume that the 
unknown parameters in are linearly parametrizable so that standard adaptive control 
can be used. Use of NN to approximate F\ obviates this restriction. The next step of 
backstepping is to make the error X2 — X2d as small as possible. Differentiating 62 defined 
in (5.21) gives 

62 =±2- X2d = F2 + G 2 XS - X2d ( 5 . 22 ) 

A fictitious controller for xz of the form 


^ 3 d = ^2 ^(—.^2 + X2d ~ ^2^2 ~ ^l^l) ( 5 . 23 ) 

can be chosen. Note that there is a coupling term Giez in ( 5 . 21 ). The purpose of the 
term Gfei is to compensate the effect of coupling due to (?ie2. Substituting the fictitious 
controller ( 5 . 23 ) into ( 5 . 22 ) gives 

62 — F2 — F2 — K262 — G161 + G26Z ( 5 - 24 ) 

with 63 = Xz — xzd, K2 > 0 a. design parameter and F2 the estimate of ^2- In a 
similar fashion, we can design a fictitious controller for Xm to make the error 6m— i — 
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Figure 5.3: Backstepping NN control of nonlinear systems in “strict-feedback” form 


Xm-i - as Small as possible, i.e„ 

^md — ~ “ ^?m-2®rn-2) (5.25) 

The dynamics of Cm-i = Xm-i - a:(m-i)d is then governed by 

Cm-l = Fm-1 — Fm-1 “ Fm-iem-l " Gm-2^m.-2 + Gm-lCm (5.26) 

with Cm = Xm- Xmd, Km-i > 0 a design parameter and Fm-i the estimate of 
Fm-x- Here, NNs are used to approximate the complicated nonlinear functions , Fj’s, 
i = 1, 2, . . . , m. As a result, no regression matrices are needed and the controller is 
re-usable for different systems within the same class of nonlinear systems. 

Step 2 - Design of Actual control u: Differentiating em = Xm — Xmd defined in (5.26) 
yields 

Cm = Xm- ^md ^ Fm + Gm'd' “ Xmd (5.27) 
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Choosing the controller of the form 


w {-Fm + - K^e,n - G^-ie^-i) (5.28) 

gives the following dynamics for error e^, 

6m = Fm — Fm, — K-^era ~ Gm-l®m-l (5.29) 

with Km > 0 a design parameter and Fm the estimate of Fm.- The overall control structure 
is shown in Figure 5.3. 


Bounding assumptions, Error dynamics and Weight tuning algorithm 

Assume that the nonlinear functions Fj’s, i = 1,2, ... ,m in equations (5.21), (5.24), 
(5.26), and (5.29) can be represented by m 2-layer neural nets for some constant “ideal” 
weights Wi, i = 1, 2, . . . , m, i.e., 

Fi = -I- €i, l|€i|| < em = constant (5.30) 

for i = 1,2, ... ,m, where <f>i’s provide suitable basis functions for m NNs. The net 
reconstruction errors ei’s are bounded by known constants en^, i = 1, 2, . . . , m. Define 
the NN functional estimate of Fi in (5.30) by 

Fi = Wi^U i = (5.31) 
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with Wi the current NN weight estimates provided by the tuning algorithm. Then the 
en-or dynamics (5.21), (5.24), (5.26), (5.29) become 

ei = Wi -,Kiei + 0 x 62 + Cl 

62 = W2 4>2 — K 2 e 2 — Gfei + 0263 + €2 

63 = — K^ez — G 2 62 + G 364 + 63 

6 m — ~ — G^_iem -1 + 6 m (5.32) 

Define ( = [ej' ei’ ... 6 ^;]^, e = [ej ei’ e^f, Z = diag{Wx,W 2 ,...,Wra}, 
r = diag{ri^T2,- ■ ■ ,Tm}, K = diag{Ki,K2,....,Km}, ^ = ^2 

and 

0 Gi 0 ... 0 

-Gf 0 G 2 ... 0 

0 -Gi’ 0 

• • ■ • ■ • Gm— 1 

0 0 ... -Gl_x 0 

The error dynamics (5.32) can be expressed in terms of the above quantities as 

C = -KC + Z'^(f> + H(: + € (5.33) 

Note that the term H( denotes the couplings between the error dynamics (5.33). The ma- 
trix H is skew-synunetric. Along with (5.1), another assumption which is quite common 
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in the neural network literature is stated next. 


Assumption 5.3 The ideal weights are bounded by known positive values so that 

Wi\\F<WiM, z = l,2,...,m (5.34) 

or equivalently, ||2^|| < where is known. The symbol H.ljp denotes the Frobenius 
norm, 

Kwan et al. [25, 39] proposed a robust weight tuning algorithm for the 2-layer NN which 
is given below. 

Theorem 5.1 (Weight tuning algorithm) Suppose Assumptions (5,1) and (5.3) are sat- 
isfied. Take the control input (5.28) with NN weight tuning be provide by 

= -ku^TiUWWi, i = l,2,...,m (5.35) 

with constant matrices Fj = > 0, t = 1, 2, . . . , m, and scalar positive constant k^. 

Then the errors ei{t),i = 1,2, ... ,m, and NN weights are UUB. 

The errors ei(t), i = 1, 2, . . . , m, can be kept as small as possible by increasing gains K 
in (5.33). The proof has already been discussed in the last chapter. This tuning algorithm 
can be implemented online and does not require any offline training. 
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5.4 Singular Perturbation Design 


5.4.1 Introduction 

The method of singular perturbations [26] recognizes the fact that a large class of systems 
have slow dynamics and fast dynannics which operate on a very different time-scales and 
are essentially independent. The control input can therefore be decomposed into a fast 
portion and a slow portion, which has the effect of doubling the number of control inputs. 
This is accomplished by a tune-scale separation that allows one to split the dynamics into 
a slow subsystem and Sifast subsystem. 


5.4.2 Singular Perturbations for Nonlinear Systems 

A large class of nonlinear systems can be described by the equations 


Xi = fl(x,u) 

(5.36) 

eX2 = hi{pi)+ h2{^i)^‘i + 92{x\)u 

(5.37) 


where the state x = decomposed into two portions. The left-hand side of 

the second equation is premultiplied by e « 1, indicating that the dynamics of X 2 are 
much faster than those of xi. In fact, the variable xi develops on the standard time-scale 
t, while = 6X2 with 

r = - (5-38) 

£ 
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so that the natural time-scale for X 2 is a faster one defined by T. The control variable u{t) 
is required to manipulate both xj and X 2 . Using the technique of singular perturbations, 
its effectiveness can be increased under certain conditions as explained next. This allows 
simplified control design. 


Slow/Fast Subsystem Decomposition 

The system may be decomposed into a slow subsystem and a fast subsystem to increase 
the control effectiveness. To accomplish this, define all variables to consist of a slow part, 
denoted by overbar, and a fast part, denoted by tilde. Thus, 

xi=xi-l-ii, X 2 = X 2 -I-X 2 , u = u-\-u (5.39) 

To derive the slow dynamics, set £ = 0 and replace all variables by their slow portions. 
From (5.37) one obtains 


0 = /2l(Xl) + f22{xi)X2 + ?2(xi)u 


which may be solved to obtain 


X2 = -/2"2n^l)[/2l(^l) + 52 (xi)u] (5.40) 


under the condtion that /22 (xi) is invertible. This is known as the slow manifold equation. 


and reveals that the slow portion of X 2 is dependent completely on the slow portion of xi 
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and the slow portion of the control input. Now one obtains from (5.36) the slow subsystem 

equation 

= fi{xi,X2,u) (5.41) 

with $2 given by the slow manifold equation. One can notice that this equation is of 
reduced order. To study the fast dynamics, one works in the time scale T, assuming that 
the slow variables vary negligibly in this time scale. From (5.36) one has 

did 

^ d , d _ 

+ Xi) — £^/i(x, u) ^ 0 

Now, one may write from (5.37) 

£-^X2 = ^(X2+X2) =/2l(^l) + /22(5l)($2+52)+52(Xl)(u + 'u), 

whence substitution from (5.40) and the assumption that = 0 yields 

d, 

-J^X2 = f22{xi)X2 + 52(xi)u (5.42) 

al 

This is ^fast subsystem, which is linear since xi is constant in the T time scale. 

Composite controls Design and Tikhonov’s Theorem 

The singular perturbation slow/fast decomposition suggests the following control design 
technique. Design a slow control u for the slow subsystem (5.41) using any method. 
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Design a fast control u for the fast subsystem (5.42) using any method. Then, apply the 
composite control 

u = u + u (5.43) 

I 

Using this technique, one performs two control designs, and effectively obtains two inde- 
pendent control input components that are simply added to produce u{t). This indepen- 
dent control design of two control inputs practically increases the control effectiveness. 
The composite control approach works due to an extension of Tikhonovas Theorem, which 
also relates the slow/fast decomposition (5.41)-(5.42) to the original system description 
(5.36)-(5.37). It states that if f 22 (^ 1 ) is invertible and the linear system (5.42) is stabiliz- 
able considering xi as slowly time-varying (i.e. practically constant), then one has 


xi = + 0{e) 

(5.44) 

X2 = X 2 + X 2 + 0{s) 

(5.45) 


where 0{e) denotes terms of order e. From these notions one obtains the concept of 5 2 as 
a boundary-layer correction term due to the fast dynamics, and of ix as a boundary-layer 
correction control term that manages the high-frequency (fast) motion. 
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5.5 Applications 


5.5.1 Rigid Link Electrically Driven Robot Manipulator 

For simplicity, we assume the actuator is a permanent magnet dc motor. The model for 
an n-link RLED robot is given by 

M{q)q + Vm{q, q)q + G{q) + F{q) +Tl= KtI (5.46) 

Li + RiI,q) + TE=^UE (5.47) 

with q, q, q G denoting link position, velocity, and acceleration vectors, respectively, 
M(q) G the inertia matrix, Vmiq, q) S i?" the Centripetal-Coriolis matrix, G{q) G 
i?” the gravity vector, F{q) G BF representing the friction terms, Ti G FF the additive 
bounded disturbance, I G FF the armature current, Kt G RF'^'^ the positive definite 

I 

constant diagonal matrix which characterizes the electro-mechanical conversion between 
current and torque, L € a positive definite constant diagonal matrix denoting the 

electrical inductance, jR(/, q) € representing the electrical resistance and the motor 
back-electromotive force, ue ^ the control vector representing the motor terminal 
voltages, and Te G RP' representing an additive bounded voltage disturbance. 

The rigid dynamics properties are enumerated in the section 5.2. The torque trans- 
mission matrix Kt is assumed to be bounded by 

< x^Ktx < A: 2 l|a:|p, for arbitrary vectors (5.48) 
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where ki and k 2 are positive scalar bounding constants. The inductance matrix L is also 
assumed to be bounded by 

h ll^lP < x^Lx < arbitrary vector x (5.49) 

where h and I 2 are positive scalar bounding constants. 


5.5.2 Control Objective and Central Ideas of NN RLED Controller Design 

The control objective is to develop a link position tracking controller [39] for the RLED 
robot dynamics given by (5.46) based on inexact knowledge of manipulator dynamics. 
Following the discussion similar to that in section 5.2, the robot dynamics can be ex- 
pressed in terms of filtered error as 

Mr = Fi-Vmr + TL-KTl . (5.50) 

where A G is a positive definite control gain and the complicated nonlinear function 

JFi is defined as 

Fi = M{q){qd + Ae) + Vm{q^ Q){Qd + Ae) + G{q) + F{q) (5.51) 
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Design Steps 


Step 1: Treat current J as a fictitious control signal to the error dynamics v 
Then (5.50) can be rewritten as 


Mr = Fi- Vmr + Tl- Kpld + KtV (5.52) 


where 77 = — J is an error signal which is to be minimized in the second step. The 

control objective of the first step is to design an NN controller for Id to make r as small 
as possible. The fictitious controller can be selected as 


Id = -rlh + krr + Ur] 

ki 


(5.53) 


where Fi = Wi<pi, kr is a positive scalar, i/r is a robustifying term to be defined shortly, 
and ki is defined in (5.48). Substituting (5.53) into (5.52) gives 


Mr = -Vrr^T+Wi^i-kr-r^r'^ei-hTL 

ki 

As one can see that there is an unknown term (j — (j>i in (5.54) because Kt 

is usually unknown. The role of the robustifying term Ur is to suppress the effect of this 
signal. The form of Ur is chosen to be ' 


Ur = prsgnir) 


(5.55) 
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where 


Pr = \\Wl <t>i\\h (5.56) 

and 6fe in (5.56) stand for the upper bound of 11(1 — ^)11. 

Step 2: The second step is to design a second NN controller for u e such that the error 
signal rj is as small as possible. Differentiating p = 1^ — L using (5.47) and multiplying 
L on both sides of the resulting expression yields 

Lr} F 2 -hTs — ue (5.57) 

where F 2 is very complicated nonlinear function of g, g, r. Id, and J. The control signal 
Ue can be chosen as 

Ue ^ F 2 + kiyT} (5.58) 

where F 2 = W 2 (f >2 and ki^ > 0. Inserting (5.58) into (5.57) gives 

Lfi = <P2 + £ 2 + Tb- kyT} (5.59) 

The controller structure is shown in Fig. 5.4. 

Step 3: Weight update algorithm and stability analysis 

The weight tuning algorithm as stated in theorem 5.1 is used here. It is shown that all 
errors such as weight updates and tracking errors are UUB. The weight tuning algorithm 
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Figure 5.4; NN controller structure for RLED 


for two networks is restated below for convenience. 


Wi 

= Fi</iir^-fc^Fi||CliWi 

(5.60) 

^2 

= T2hr'^ - kMlCm 

(5.61) 


with any constant matrices Fi = > 0, F 2 = F 2 > 0, C = scalar 

positive constant k^. 


Simulation 

The model for RLED is described in the form (5.46) with 


M{q) = 


a + bcos{q 2 ) c + |cos(g 2 ) 
c+^cos{q 2 ) c 
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-6sm(92)(gig2 + 0.5g|) 

Vmq = 

0.5bsin{q2)qi 
dcos(qi) + ecos{q 2 ') 

G{q) = 

ecos{qi + 

a = 127712 + li{mi + m 2 ), b = 2lil2m2, 
c — l 2 m 2 , d = (mi + m 2 )ligo, e = 

The parameter values are li = Im, I 2 = Im, mi = 0.8Kg,m2 = 2.3Kg, and go = 
9.8m/ The actuator dynamics are assumed to be the permanent magnet direct-current 
motor (5.47). The parameters of motor are Rj = Ifi, Lj = O-OliT, Kj = 2.0Nm/A, 
j = 1,2. The desired joint trajectories are defined as qdi{t) = sin{t), qd 2 {t) = cos{t). 
The inputs to the NNs are given by 

^ = [Cf Cj cos{q)'^ sin{q)'^ F if 

where Ci = + Ae and C 2 = 4d + Figure 5.5 shows the simulation results 

with PD control with K = diag{100, 100} and Kd = diag{20, 20}. Figure 5.6 shows 
the simulation results for NN back-stepping control with kr = diag{Q0, 60} and = 
diag{1.5, 1.5}. 

Remarks: 

• The problem of weight initialization occuring in other approaches in the literature 
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NN Backstepping control of 2-Link RLED 


NN Backstepping control of 2 - Link RLED 



(a) Link 1 Position Tracking (b) Link 2 Position Tracking 

NN Backstepping control of 2 - Link RLED 



(c) Control Torques 

Figure 5.6; NN Back Stepping control of 2-link RLED 
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does not arise, since weights ^^(O) are taken as zeros. 

• The tuning algorithm guarantees the boundedness of weights Wi's. 

• In PD control one can get better tracking performance and thus smaller tracking 
error by increasing the gains. However, it is a well known fact that high-gain feed- 
back may not be desirable since it may excite some high-frequency unmodeled 
dynamics. 

• The NN controller can indeed improve the tracking performance without resorting 
to high-gain feedback. 

5.5.3 Flexible Link Manipulator 

Flexible-link robotic systems [26] comprise an important class of systems that include 
lightweight arms for assembly, civil infrastructure bridge/vehicle systems, military tank 
gun barrel applications, and large-scale space structures. Their key feature is the pres- 
ence of vibratory modes that tend to disturb or even destabilize the motion. They are 
described as infinite-dimensional dynamical systems using partial differential equations 
(PDF); some assumptions make them tractable by allowing one to describe them using or- 
dinary differential equation (ODE), which is finite-dimensional. The dynamics of flexible 
link manipulator can be described as 

M{q)q + Vm{qA)q + Kq + F{q) + G{q) + Td = B{q)r (5.62) 
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where q{t) G consists of all rigid and flexible mode variables. One may partition the 
dynamics (5.62) according to the rigid and flexible modes as 


Mrr 

Mrf 


Qr 

+ 

Vrr 

Vrf 


Qr 

Mfr 

Mff 


. 


1 

1 




0 

0 


(Jr 


Fr 


Gr 


Br 





+ 


+ 


= 


0 

I 


. . 


0 


0 




(5.63) 


The major difference between (5.62) and rigid robot equation (5.46) is that the input 
matrix B{q) has more rows than columns, so that the flexible-link arms have reduced 
control effectiveness and the techniques used for rigid robots cannot be directly applied. 
An ameliorating factor is that, like the rigid-link case, flexible-link dynamics satisfy the 
properties enumerated in section 5.2. In equation (5.62) one can notice that gravity and 
friction only act on the rigid modes, while the flexibility effects in K only effect the 
flexible modes. The control matrix Br is nonsingular. The equation (5.63) may also be 
written as 


Mrr 'qr + ^rfQf + ^rrQr + ^rfQf "b R'riAr) + Gr(9r) 
Mfrqr + Mffqf + Vfrqr + Vffqf -f Kffqf 


BfT 

BfT (5.64) 
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To write a state equation, define 



Hrf 


Mrr 

Mj-f 

Hfr 

Hff _ 



Mff ^ 


Multiply (5.63) by (5.65) from the left, rearrange the terms, and write 


(5.65) 


qr = -V^\qr-V,)qf-K^fqf-F)-Gl + B^T (5.66) 

qf = -V},qr-V^fqf-K}fqf-F}-G} + B}T (5.67) 


with: 


Vj, = HrrVrr + HrfVfr, = HfrVrr + HffVfr, 

V,) = HrrVrf + HrfVff, = HfrVrf + HjfVff, 

K}f = HrfKff, K}f = HffKff, 


F^ — HrrFr, 
Gj = HrrGr, 
B^ = Hr-rBr + HrfBf, 


F} = HfrFr, 
G) = HfrGr. 


B^ = HfrBr + HffBf. 


Now, equations (5.66)-(5.67) can be placed into state-space form x = /(a;, u) by defining 
the state, for instance, as a; = [q q]'^ = [Qt Qf Qr • 


Control Difficulties : 
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The objective is basically to force the rigid mode variable E to follow desired 
trajectories. The control input available is r G since there is one actuator per link. 
However, the extra state qf £ RP^ introduces Uf additional vibratory degrees of freedom 
that require control to suppress the vibrations. Therefore, the number of inputs is less than 
the number of degrees of freedom. Moreover, it turns out that by selecting the control in- 
put r(t) to achieve practical tracking performance of rigid variable qr{t)y one actually 
destabilizes the flexible modes This is due to the non-minimum phase nature of 

zero dynamics of flexible-link robot arms. Here, singular perturbation technique [26] is 
used to solve this control problem for a one-link robot arm with two retained flexible 
modes. 

Open-Loop Behaviour of Flexible-link Robot Arm 

The model of one-link robot arm with pinned-pinned boundary conditions and two re- 
tained modes is given by (5.62) with 


2.2024 

0.0517 

0.0410 


0.0200 

0.0013 

0.0027 , 

0.0517 

0.0026 

0.0036 

, Vn = 

0.0013 

0.0001 

0.0002 

0.0410 

0.0036 

0.0080 


0.0027 

0.0002 

0.0004 


0 

0 

0 


1.0000 

0 

14.0733 

0 

, G = 0, B = 

0.0668 

0 

0 

225.1734 


0.0668 
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The open-loop poles are found to be 


$ = 


0 

- 0.01 

-0.0002 ± 89.97i 


y -0.0000 ± 342.01Z j 


There are two poles near 5 = 0 corresponding to rigid joint angle and two complex 
pole pair, almost completely undamped, one corresponding to first mode with frequency 
14.32 Hz and one corresponding to second mode with frequency 54.43 Hz. The open- 
loop response of flexible arm is shown in Fig. 5.7. The torque profile chosen for the 
open-loop response is also shown in the same figure. 


NN based singular perturbation design 

Introduce a small scale factor e and define ^ and K// by 

(5.68) 

where 1/ is equal to the smallest stiffness in Kfj. Substitution of (5.68) into (5.66) and 
(5.67) gives the system in the form, 

qr = -V,\qr - V,yi - il/e'^)Hrfkffe^^ - -Gl + B^r (5.69) 

= -Vfrqr - VfYi - - Fj -G] + Bjr (5.70) 
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where Kff is invertible because it is diagonal. Define now the control 


T = T + Tf 


(5.71) 


with r the slow control component and rp a fast component. Setting £ = 0 yields 


qr = -KUr - HrfiCffi - + B^f (5.72) 

I = Fj -G} + Bjf) (5.73) 


(5.72) is the slow dynamic equation and (5.73) is the slow manifold equation. Substituting 

(5.73) into (5.72), one can obtain the slow subsystem given by 


qr = {Hrr-HrfHj}Hfr)[-Vrrqr-Fr-Gr + Brf] 

= M-1 [-Vrrqr - Fr - Gr + Brf] (5.74) 


To derive the fast subsystem, select states Ci = ^ ~ C2 = 3Dd (5.70) can be written 
as 


sCi = C2 

eCa = -Vjr^r - V^feC2 - SffKff{Ci + q) - Fj - Gj + Bjr (5.75) 
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Now making a time-scale change of T = f/e and observing that d^/dT « 0 and £ w 0, 
the fast dynamics is given by 


dT 

dT 


C2 

-HffKffCi + BfTF 


(5.76) 


It is important to note that this is a linear system, with the slow variables as parameters. 
A number of control techniques for this linear system can be designed. The fast control 
can be chosen as 


TF = —[KpF K^f] 


Cl 

C2 


Kpf KiF 


if + KpfI 


(5.77) 


with ^ given by (5.73). In terms of filtered tracking error, the slow subsystem (5.74) can 
be rewritten as 

Mrj.r = —Vrrr — Brf + f (5.78) 

where the unknown nonlinear robot function is 


f{X) = MrriQd + + Vrriid + Ae) -f Fj. -f- Gr (5.79) 


and one may select 

X = [eeqd id id]'^ 
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Stable NN controller: A control torque for the slow subsystem is defined as 

f = B-\f + K^r - v] (5.80) 

with f{X) an estimate of f{x), a gain matrix > 0 and a function v{t) that 

provides robustness. The closed-loop system for rigid dynamics 'can now be written as 

= -{Ky -h Vrr)T + / + (5.81) 

where functional estimation error is given by / = / — /. A 2-layer NN is used to estimate 
the nonlinear function /(x). The activation function of neurons is sigmoidal. The weights 
for first layer is V and for second layer it is W, The weight tuning algorithm [26] is given 
as 


W = Faiy'^Xy'^ -Fa'V'^Xr^ - KF\\r\\W 
V = GXip^Wrf - KG\\r\\V (5.82) 

The robustifying signal is given by 


v{t) = -K,{\\Z\\ + ZbV (5.83) 

where Z = diag{W, V} and ||Z||ir < Zb- The NN control structure is shown in Figure 
5.8. The response of flexible arm for e = 0.05 is shown in Fig. 5.9. The response of 
flexible arm for sinusoidal trajectory tracking is shown in Fig. 5.10. 
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Figure 5.8: Neural Net controller for Flexible-Link robot arm 


Remarks: 

1. SPD allows splitting up a system into subsystems for which control can be designed 
independently from each other. 

2. As can be seen from figures. Neural Network does improve the tracking by sup- 
pressing vibrations due to flexible modes. 

3. Neural Networks help in approximating unmodelled dynamics and uncertain pa- 
rameters inherent in the system. 

5.5.4 Rigid-Link Flexible-Joint Manipulator 

The model for an n-link RLFJ robot [25] is given by 

Miq)q + Vm{q, q)q + G{q) + F{q) +Tl + K{q - g/) = 0 (5.84) 

Jqm + Bqm-K{q-qm)=u + TB (5.85) 
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Posidon(radian) and Velocity (radian/sec) 


0.2 


Flexibk Link ManipuUlor 


Fkxible modes 



(a) position and velocity tracking 
by rigid link 



Hexible link Manipulator control 



Figure 5.9: Closed loop response of flexible link manipulator with NN based SPD control 


91 





Flexible modes 


Flexible Uni: manipuktor (SPD) 


Flexible Imk manipulator (SPD) 




Flexible link Manipulator (SPD) 




Figure 5.10: Closed loop response of flexible link manipulator for sinusoidal trajectory 
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with g, 4, q S i?” denoting the link position, velocity and acceleration vectors, respec- 
tively, M{q) G the inertia matrix, Vm{q-, q) € R"', the centripetal-coriolis matrix, 

G(q) G i?”, the gravity vector, F(q) G J?” representing friction terms, Tl € the 
additive disturbance, qm, qm, qm 6 -R”, the motor shaft angle, velocity, acceleration, re- 
spectively. itr G the positive definite constant diagonal matrix which characterizes 
joint flexibility, J G R"^" a positive definite constant diagonal matrix denoting motor 
inertia, B G represents natural damping term, u the control vector used to repre- 
sent the motor torque, and Te S R" representing an additive bounded torque disturbance. 
The rigid dynamics (5.84) has two important properties (5.1) and (5.2), stated in section 
5.2. The joint elasticity matrix K is bounded by 

< x^Kx < A: 2 ||x|P, for arbitrary vectorx 

where ki and k 2 are positive scalar bounding constants. The motor inertia matrix J is 
also bounded by 


< x^Jx < IzIblP, for arbitrary vector x 

where Zj and Z 2 are positive scalar bounding constants. 

Control Objective: The control objective for the flexible-joint robot arm is to control 
the arm joint variable q{t) in the face of the additional dynamics of the variable qmii), 
the motor angle. This is similar to the situation for the flexible-link arm (5.62), where 
one must control the arm in the face of the additional dynamics of q/(t), the flexible 


93 




Figure 5.11: Open-loop response of fast variable q — Qm 

modes. However, control design for these two problems must be approached by different 
philosophies. The flexible-link arm is fundamentally a disturbance rejection problem, 
while for flexible-joint arm one is faced with manipulating an intermediate variable qm{t) 
that has subsequent influence on the variable of interest q{t). 


Singular Perturbation Design for 2-Link RLFJ manipulator 

Before one starts with singular perturbation design, one must see whether it is possible to 
split the system into two independent subsystems namely fast and slow subsystems. Since 
the dynamics of robot arm and the motor are in almost same time scale, we choose q—qm 
as our fast variable. The open-loop response of q qm for K = 100 and K = 1000 is 
shown in the Fig. 5.1 1. As one can see, for higher value of AT, the freqency of fast variable 
increases and thus one has a better chance of splitting the given system into fast/slow 
subsystems. Lets choose = q — q-m and K = We can write (5.84)-(5.85) as 
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follows: 


0 = Mq + Vmq + G + F + k (5.86) 

= Jq + Bq-Bs^i-k-u (5.87) 

The slow dynamics can be obtained by substituting £ = 0 in above equations. The slow 
manifold equation is obtained from (5.87) as 

l = k-'^{Jq + Bq-u) (5.88) 

where f, u and q represent slow variables. Substituting ^ from (5.88) and e = 0 into 
(5.86), one can write the slow dynamics as 

(M + J)q + (Vm 4* B)q) + G + F u 

Mq+Vq + G + F--u (5.89) 

where M = M + J and V = Vm + B. The slow subsystem is of reduced order. By 
defining two state variables Ci = ^ ^ and C 2 = the state equations for fast subsystem 
may be written as 

eC = C2 

£<^2 = q+J-'^Bq-J-^BeH-J-'^k^-J-'^u (5.90) 
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Substituting u = u + u, ^ = + ^ and ^ from (5.88) into (5.90), we get 


<2 = -J~^Be<:2 - - J-^u (5.91) 


Now, changing the time scale T = t/e and assuming that e w 0, we get the fast subsystem 


dT 

dT 


= C2 


(5.92) 


This is a linear system which can be stabilized using any of the linear system techniques 
like LQR or state-feedback. The control input for the fast system is given as 

^ = ^iq-Qm) + ^iq-qm)-Kpfi (5.93) 

The slow subsystem is a nonlinear system which is now controlled as follows: 

Define e = gd — g, r = e4' Ae. The slow subsystem (5.89) can be written in terms of 
filtered error as, 


Mr = M(gd 4- Ae) + Vq-{-G-{-F-u 

= M {qd + Ae) -f V {q^ -f Ae) + — Vr — u{qd + Ae) + V {q^, 4- Ae) + C? + jP 

= Fi--Vr-u (5.94) 
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where 


Fi = M{qd 4- Ae) + V{qd + Ae) + G +-F 

Lets choose a control 

iZ = + A - ^ (5.95) 

where A is the approximation of Fi obtained from a 2-layer NN. u is the robustifying 
term (5.83). The closed-loop response of the slow system is given as follows: 

1 

Mr = (Fi - A) - (t^ + K^)r + i/ (5.96) 

The weight update algorithm (5.82) ensures the stability of above closed loop system. 
Thus we obtain two independent control inputs u from (5.95) and u from (5.93) and the 
final control input is given as u = tt + tZ. The simulation results for K = 1000 are shown 
in Fig. 5.12. Here Kpf = diag{0.Z7, 0.37} and K^f = diag{0m, 0.69}. 

Remarks: 

• Lower value of stiffness coefficient matrix K necessitates high control input. So, 
the stiffness matrix should be very high, the reason being the fact that for highly 
flexible system, we can’t split the system into fast/slow subsystem. 

• The gain matrices in (5.93) Kpf and K^f are very small. This method does not 
require high feedback gains which is certainly desirable. 

• NN approximation obviates the necessity for exact computation of nonlinear and 
uncertain terms thereby simplifying the control problem. 
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Figure 5.12: Simulation Results of NN based SPD control of 2-Link Flexible-Joint Ma- 
nipulator 
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5.6 Summary 


In the previous chapters, we studied how Neural Networks can be used for identifying 
nonlinear systems. In this chapter, we study various neural network based controllers 
for nonlinear as well as underactuated systems. For fully actuated systems, we have 
many control techniques namely, feedback linearization, backstepping and other robust 
and adaptive control techniques etc. The control strategies become more complex when 
the number of actuators become less than the degrees of freedom. The problem with these 
techniques is that they require certain simplifying assumptions which limit their applica- 
bility and in many practical situations, these assumptions don’t hold good. Moreover, the 
computation of regression matrices in adaptive controllers is very formidable. We saw 
that the use of Neural Network circumvents these problems and don’t require any such 
restricting assumptions. Thus, NN based controllers can be applied to a wider class of 
problems. Two NN based controllers namely, NN-Backstepping and NN-singular pertur- 
bation have been studied in detail and their application to various underactuated systems 
like, flexible-link and flexible joint robot manipulators have been demonstrated. 
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Chapter 6 


The Pendubot 

6.1 Introduction 

The pendubot is a two-link planar robot with an actuator at the shoulder (link 1) and no 
actuator at the elbow (link 2). The link 2 moves freely around link 1 and the control ob- 
jective is to bring the mechanism to the unstable equilibrium point. Block [29] proposed 
a control strategy based on two control algorithms to control the pendubot. For swing up 
control, he used partial feedback linearization technique and for balancing control, he lin- 
earized the system about the top equilibrium point and used Linear Quadratic Regulator 
(LQR) or pole placement techniques to stabilize at the top position. Fantoni and Lozano 
[27] utilized the passivity properties of pendubot and used an energy-based approach to 
propose a control law. They also provided complete stability analysis of their method. 
In this chapter, I tried to analyze Block’s method and made use of Neural Networks to 
stabilize the pendubot at its top unstable equilibrium point. Since this is an underactu- 
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Figure 6.1: The pendubot system 

ated system, one need to consider the energy aspects of the pendubot while designing a 
controller. 


6.2 NN based Partial Feedback Linearization 

Consider the pendubot as shown in the figure 6.3. The equations of motion for the Pen- 
dubot can be found using Lagrangian dynamics [27]. The equations can be written in 
matrix form as follows; 

D{q)q + C{qA)i + 9i(l) = '^ (6.1) 

where 
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D{q) = 

$1 + 92 + 29zcos{q2) 

92 + 9zcos{q2) 
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$2 + 9zcos{q2) 

92 
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and 


C{qA) = 


-dzsin{q2)q2 -0zsin{g2)qi 
0zsin{q2)qi 0 


9{q) 


L 

j 


digcosiqi) + ezgcos{qi + qz) 

T = 

n 

ezgcosiqi + qz) 


0 


6 i = milh + mzll + h 62 = + h 

Oz = rnzhlci &4 = mild + mzh h = mzlci 


The various variables used above are: mi and m 2 are the masses of link 1 and 2 respec- 
tively, li and I 2 are their corresponding lengths, Ici and Zc2 are the distances to the center 
of masses of the respective links, Ji and I 2 are the moments of inertia of the two links 
about their centroids. 


6.3 Swing Up Control 

The equations of motion of Pendubot are given by (6.1). Performing the matrix and vector 
multiplication, the equations of motion are written as 

+ <^12^2 + ciiqi + Ci2(?2 + (6-2) 

0 = ^ 21^*1 + (I 22 Q 2 + C2iqi +92 ( 6 . 3 ) 
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Due to the underaetuation of link 2, we can not linearize dynamics about both degrees 
of freedom. We can however linearize one of the degrees of freedom. In the case of 
Pendubot, %ve lineanze about r/i . The equation (6.3) is solved for link 2’s acceleration 

(h = - C 2 iqi - g2 

d22 

Substituting q from the above equation into (6.2), we get 

D = duQi + cnqi + ci 2 g 2 + g\ (6.4) 


with 

C12 = C!2 91 = 51 - ^ 

In terms of liltered error, (6.4) can be written as 


dll f = di 1 (</rfi + Ae) + cii (4di + Ae) + C 1252 + 3i “ cur - d 


= F - Cur - Ti 


(6.5) 


where 


r = e + Ae, e — qn- 5i 
F dn(</,n + Ae) + Cn{qdi + Ae) 4- ci 2«2 + gi 
The nonlinear function F can also include friction terms which we have neglected in the 
model (6. 1 ). Moreover, various pendubot parameters may not be known quite accurately. 
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Figure 6.2: NN Controller for Pendubot 


We can approximate this nonlinear and uncertain function using a 2-Iayer Neural Network 
[25, 39] . The NN controller is shown in Fig. 6.2. A control input can be chosen as 

Ti = jP + k-[r (6.6) 


The closed loop dynamics for qi is given as 

diir = {F — F) - (cii + ki)r (6.7) 

The closed loop dynamics is stable provided ||i^ — .P|| < e. The weight update algorithm 
proposed by Kwan et al [25], ensures the boundedness of weights as well as error signal 
{F — F). If the NN approximation holds good then one can see that the closed-loop 
dynamics (6.7) becomes a linear one. Since only one link is being linearized, it can be 
considered as partial feedback linearization. 

By giving a step trajectoi 7 , it is easy to bring the first lidk to the top position, but we 
need to pump sufficient energy to the second link so that it rotates around its own axis. 
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With this objective in mind, we choose following swing up trajectory for the link 1 



Arc for t <2it 
'Kjl for t>2'K 


Once the link 2 comes closer to the top unstable equilibrium point we switch over to some 
linear control as mentioned next. 


6.4 Balancing Control 

We linearize the Pendubot’s nonlinear equations of motion about the top unstable equi- 
librium position (qi = 7r/2, 52 = 0)- The linear model for the top position is as follows 

x = Ax Bu (6.8) 

where 



0 

1 

0 

0 


0 

A = 

51.9265 

0 

-13.9704 

0 

B = 

15.9549 


0 

0 

0 

1 


0 


-52.8402 

0 

68.4210 

0 


-29.3596 


By using pole-placement technique, we design a state feedback controller u = —Kx to 
stabilize the system around its equilibrium point. 
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Figure 6.3: NN based Partial Feedback Linearization control of Pendubot 
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6.5 Simulation 


The pendubot parameters are Ox = 0.0799, 0i = 0.0244, Oz = 0.0205, 0/x = 0.42126, 
Oz = 0.10630 and g = 9.8. The swing up control (6.6) is used to bring the link 1 to 
the top position, once the link 2 reaches close to the top position (Ig^i - 9 i| < 0.2, 
Ite -gi\< 0.2), we switch over to a linear state feedback controller where the state 
feedback gain is given by 

K = [^32.68 - 7.14 - 32.76 - 4.88] 

The experimental results are shown in figure 6.3. 

Remarks: 

• Neural Networks help in taking unmodelled dynamics like friction into account 
while designing controller and thus simplifies the numerical computation required. 

• Control input is bounded. 

• The initial trajectory for link 1 has been decided heuristically. One has to make sev- 
eral trials with the real system to come up with a suitable trajectory which will take 
the pendubot to its upright position in a smooth manner. Fantoni et al [27], have 
proposed a energy based method, where they bring the second link into homoclinic 
orbit. One may investigate along that line to come up with a smooth trajectory. 
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6.6 Summary 


The pendubot is an example of underactuated system where the number of degrees of 
freedom is less than the number of actuators. A controller based on NN is proposed in 
this chapter. The NN is used to approximate the inherent nonlinearities and unmodelled 
dynamics of the system thereby relieving the designer from tedious mathematical com- 
putations. Kwan’s weight update algorithm [25] ensures the boundedness of weights as 
well as tracking error. This controller has some resemblance to Block’s work [29], in the 
sense that we too use partial feedback linearization for link 1. In his work, two loops 
have been used for swing up control, the outer-loop being a PD control with feedforward 
acceleration which achieves the trajectory tracking of link 1 and the inner-loop is partial 
feedback linearization loop which linearizes the link 1 dynamics. He too selected a tra- 
jectory in the beginning for link 1 in order to excite the link 2 dynamics which was based 
on his experience with his hardware set up. In our case, we have only one control loop. 
The shortcoming of our algorithm is that one has to try a number of trajectories in the 
initial stages which not only excites the link 2 but also brings it closer to the top position. 
Further investigations may be done in this aspect to make it a robust control strategy. 
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Chapter 7 


Conclusion 

7.1 Contributions 

In this thesis, system identification and control of nonlinear systems using neural net- 
works have been studied and analyzed. Some of the contributions of this work are as 
follows: 

• Extended Kalman Filtering (EKF) has been used to train a Memory Neuron Net- 
work (MNN) for the first time. Performance comparison among three algorithms 
namely back propagation through time (BPTT), real time recurrent learning (RTRL) 
and EKJF for this network has been done in identifying both SISO as well as MIMO 
systems. 

• A novel learning algorithm based on Lyapunov stability theory has been proposed 
for feedforward networks. Its performance has been compared with existing BP 
and EKF algorithms. The nature of adaptive learning rate has been analyzed in de- 
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tail. A modification has also been suggested to speed up the convergence. Through 
simulations, the proposed algorithm has been shown to give faster convergence. 
Various system identification issues have also been discussed for the proposed al- 
gorithm. 

• Some of the recent NN based adaptive and robust control techniques have been 
studied in detail. Two methods namely NN based robust backstepping control and 
NN control based on singular perturbation have been used to control various robot 
manipulators. Through simulation results, it is shown that NN controllers give 
substantial improvement in the performance. 

• A new NN control based on partial feedback linearization has been suggested for 
pendubot. It is shown that some of the unmodelled dynamics like friction can be 
taken into consideration by u.sing NN. 


7.2 Scope of Future work 

• Finding a simple learning algorithm for updating weights in neural networks, which 
is faster in terms of convergence time and computational complexity, is still an 
open research problem. A lot of modifications can be brought about both in the 
architecture as well as the learning algorithm to achieve faster convergence. 

• Usually, feedforward networks are used for controlling nonlinear dynamical sys- 
tems like robot manipulators, owing to its ease of implementation. But feedfor- 
ward networks have certain limitations, for instance, they need complete knowl- 
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edge about the states of the system. It is seen that MNN structure provide the 
capability of RNN along with the simplicity of MLR Thus MNN may be used for 
implementing online controllers for robot manipulators. 

The proposed LF algorithm has been shown to be fast and easy to implement. Here, 
this algorithm has been applied only for system identification. So, one may also use 
it to design online controllers. 

The suggested NN based control for pendubot need to be tested on a hardware set 
up. 

Neural Network and machine leaiming approaches have not been applied to under- 
■ actuated mechanical systems. If will be interesting to see if these concepts can be 
used to design meaningful controllers for this class of nonlinear systems. 



Appendix A 


Deft nitions and theorems 


A.l Barbalat^s lemma 


If the differentiable function f{t) has a finite limit as f > oo, and if / is uniformly 
continuous, then /(f) — > 0 as f oo. 

A function is said to be uniformly continuous on [0,oo) if VK > 0, 3 r]{R) > 0, 
Vfi > 0, Vf > 0, |f - fi| < 7? - c?(fi)| < R. A sufficient condition for a 

differentiable function to be uniformly continuous is that its derivative be bounded. 


Corollary A.l if the differentiable junction f{t) has a finite limit as t -* oo, and is such 
that f exists and is bounded, then f{t)-^0ast-^0. 
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A.2 Strictly Positive Real Systems 


A transfer function h{p) is positive real if 

Re[h{p)] > 0 Vi?.e[p] > 0 

It is Strictly positive real if h{p - e) is positive real for some e > 0. 

Theorem A.l A transfer function h{p) b strictly positive real (SPR) if and only if 

• h{p) is a strictly stable transfer function 

• the real part ofh{p) is strictly positive along the ju) axis, le., Vw > 0, Re[h{jcj)] > 
0 

The above theorem implies following necessary conditions for asserting whether a given 
transfer function h{p) is SPR: 

• h{p) is strictly stable 

• The Nyquist plot of lies entirely in the right half complex plane. 

• h{p) has a relative degree of 0 or 1 

• h{p) is strictly minimum phase 

A.3 Zero Dynamics 

The part of system dynamics, which can not be seen from external input-output relation- 
ship, is known as the internal dynamics of a system. For linear systems, the stability of 
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the internal dynamics is simply determined by the location of zeros. This concept can not 
be extended to nonlinear systems. 

The zero dynamics is defined to be the internal dynamics of the system when the 
system output is kept at zero by the input. Two useful remarks can be made about the 
zero dynamics of nonlinear systems. 

• the zero dynamics is an intrinsic feature of a nonlinear system, which does not 
depend on the choice of control law or the desired trajectories. 

• the stability of internal dynamics of nonlinear system can be judged by examining 
the stability of zero dynamics. 

A system with unstable zero dynamics is called a non-minimum-phase system. 

A.4 Persistent Excitation 

By persistent excitation of a signal vector v{t), we mean that there exist strictly positive 
constants cti and T such that for any t> 0, 

rt+T 

/ v^vdr > a\I ■ 

Intuitively, the persistent excitation of v{i) imphes that the vectors v{t) corresponding to 
different times t cannot always be linearly dependent. 
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